NOVEL MOUSE POLYPEPTIDES ENCODED BY
POLYNUCLEOTIDES AND METHODS OF THEIR USE 

Priority Claim 

[001] This application is related to the following provisional applications 
filed in the United States Patent and Trademark Office, the disclosures of which are 
hereby incorporated by reference: 



Application Number | Title | Filing Date
60/476,632 | Novel Mouse Polynucleotides Relating to Kinases, Phosphatases, and Proteases | June 9, 2003
60/476,621 | Methods of Use for Novel Mouse Polynucleotides Relating to Kinases, Phosphatases, and Proteases | June 9, 2003
60/485,539 | Novel Mouse Polynucleotides Relating to Secreted and Transmembrane Proteins | July 8, 2003
60/485,217 | Methods of Use for Novel Mouse Polynucleotides Relating to Secreted and Transmembrane Proteins | July 8, 2003



Technical Field 
[002] The present invention is related generally to novel
polynucleotides and novel polypeptides encoded thereby, their compositions,
antibodies directed thereto, and other agonists or antagonists thereto. The 
polynucleotides and polypeptides are useful in diagnostic, prophylactic, and 
therapeutic applications for a variety of diseases, disorders, syndromes and 
conditions, as well as in discovering new diagnostics, prophylactics, and therapeutics 
for such diseases, disorders, syndromes, and conditions (hereinafter disorders). The 
present invention also relates to methods of modulating biological activities through 
the use of the novel polynucleotides and novel polypeptides of the invention and 
through the use of agonists and antagonists, such as antibodies, thereto. 

[003] This application further relates to the field of polypeptides that
are associated with regulating cell growth and differentiation, that are over-expressed
in cancer, and/or that can be associated with proliferation or inhibition of cancer 
growth, including hematopoietic cancers such as leukemias and lymphomas, and solid
cancers such as lung cancer, for example, adenocarcinomas and/or squamous cell
carcinomas. These polypeptides may also be associated with other conditions, such as 
inflammatory, immune, and metabolic disorders, as well as microbial infections, 
including viral, bacterial, fungal, and parasitic diseases, disorders, syndromes, or 
conditions.

[004] This application further relates to modulators of biological
activity that can specifically bind to these polynucleotides or polypeptides, or
otherwise specifically modulate their activity. For example, they can directly or 
indirectly induce antibody-dependent cellular cytotoxicity (ADCC), complement- 
dependent cytotoxicity (CDC), endocytosis, apoptosis, or recruitment of other cells to 
effect cell activation, cell inactivation, cell growth or differentiation or inhibition 
thereof, and cell killing. 

[005] The sequences of the invention encompass a variety of different
types of nucleic acids and polypeptides with different structures and functions. They
can encode or comprise polypeptides belonging to different protein families ("Pfam"). 
The "Pfam" system is an organization of protein sequence classification and analysis, 
based on conserved protein domains; it can be publicly accessed in a number of ways, 
for example, at http://pfam.wustl.edu. Protein domains are portions of proteins that 
have a tertiary structure and sometimes have enzymatic or binding activities; multiple 
domains can be connected by flexible polypeptide regions within a protein. Pfam 
domains can comprise the N-terminus or the C-terminus of a protein, or can be 
situated at any point in between. The Pfam system identifies protein families based 
on these domains and provides an annotated, searchable database that classifies 
proteins into families (Bateman et al., 2002). 

[006] Sequences of the invention can encode or be comprised of more
than one Pfam. Sequences encompassed by the invention include, but are not limited
to, the polypeptide and polynucleotide sequences of the molecules shown in the 
Sequence Listing and corresponding molecular sequences found at all developmental 
stages of an organism. Sequences of the invention can comprise genes or gene 
segments designated by the Sequence Listing, and their gene products, i.e., RNA and 
polypeptides. They also include variants of those presented in the Sequence Listing 
that are present in the normal physiological state, e.g., variant alleles such as SNPs, 
splice variants, as well as variants that are affected in pathological states, such as 
disease-related mutations or sequences with alterations that lead to pathology, and 
variants with conservative amino acid changes. Sequences of the invention are
categorized below; any given sequence can belong to one or more than one category. 
Secreted Protein-Related Sequences 

[007] Secreted proteins, also referred to as secreted factors, include proteins 
that are produced by cells and exported extracellularly, extracellular fragments of 
transmembrane proteins that are proteolytically cleaved, and extracellular fragments 
of cell surface receptors, which fragments may be soluble. An example of a secreted 
protein is keratinocyte growth factor (KGF), which stimulates the growth of 
keratinocytes, and is useful for repairing tissue after chemotherapy or radiotherapy. 

[008] Many and widely variant biological functions are mediated by a wide 
variety of different types of secreted proteins. Yet, despite the sequencing of the 
human genome, relatively few pharmaceutically useful secreted proteins have been
identified. It would be advantageous to discover novel secreted proteins or 
polypeptides, and their corresponding polynucleotides that have medical utility. 

[009] Pharmaceutically useful secreted proteins of the present invention
will have in common the ability to act as ligands for binding to receptors on cell 
surfaces in ligand/receptor interactions, to trigger certain intracellular responses, such 
as inducing signal transduction to activate cells or inhibit cellular activity, to induce 
cellular growth, proliferation, or differentiation, or to induce the production of other 
factors that, in turn, mediate such activities. 

[010] The cell types having cell surface receptors responsive to secreted 
proteins are various, including, for example, stem cells; progenitor cells; and 
precursor cells and mature cells of the hematopoietic, hepatic, neural, lung, heart, 
thymic, splenic, epithelial, pancreatic, adipose, gastrointestinal, colonic, optic, 
olfactory, bone and musculoskeletal lineages. Further, the hematopoietic cells can be 
red blood cells or white blood cells, including cells of the B lymphocytic (B cell), T 
lymphocytic (T cell), dendritic, megakaryocyte, natural killer (NK), macrophagic, 
eosinophilic, and basophilic lineages. The cell types responsive to secreted proteins 
also include normal cells or cells implicated in disease, disorders, syndromes, or other 
pathological conditions. 

[011] As an example, certain of the secreted proteins of the present 
invention can stimulate T or B cell growth or differentiation by interacting with 
precursor T or B cells or hematopoietic progenitor cells, or bone marrow stem cells. 
As another example, certain secreted proteins of the present invention can maintain 
stem cells, progenitor cells or precursor cells in an undifferentiated state. As a further
example, certain secreted proteins of the present invention can regulate bone growth 
by stimulation or inhibition thereof, secretion of insulin, glucose metabolism, cell 
proliferation, response to microbial infection, and regeneration of tissues including 
neural, muscular, and epithelial. Moreover, certain secreted proteins of the present 
invention can induce apoptosis such as in cancer cells or inflammatory cells. 

[012] Certain of the secreted proteins of the present invention are useful for
diagnosis, prophylaxis, or treatment of disorders in subjects that are deficient in such 
secreted proteins or require regeneration of certain tissues, the proliferation of which 
is dependent on such secreted proteins, or requires an inhibition or activation of 
growth that is dependent on such secreted proteins. Examples of such disorders 
include cancer, such as bone cancer, brain tumors, breast and ovarian cancer, Burkitt's 
lymphoma, chronic myeloid leukemia, colon cancer, endocrine system cancers, 
gastrointestinal cancers, gynecological cancers, head and neck cancers, leukemia, 
lung cancer, lymphomas, malignant melanoma, metastases, multiple endocrine 
neoplasia, myelomas, neurofibromatosis, pancreatic cancer, pediatric cancers, penile 
cancer, prostate cancer, disorders related to the Ras oncogene, retinoblastoma (RB), 
sarcomas, skin cancers, testicular cancer, thyroid cancer, urinary tract cancers, and 
von Hippel-Lindau syndrome. 

[013] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of hematopoiesis, including thrombosis;
bleeding; anemias, e.g., iron deficiency and other hypoproliferative anemias, 
megaloblastic anemias, hemolytic anemias, acute blood loss, and aplastic anemia; 
hemoglobinopathies; disorders of granulocytes and monocytes; myelodysplasias and 
related bone marrow failure syndromes; polycythemias, e.g., polycythemia vera; acute 
and chronic myeloid leukemia, and other myeloproliferative diseases, e.g., 
malignancies of lymphoid cells; stimulation of replacement cell growth following 
irradiation or chemotherapy; and plasma cell disorders. 

[014] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of hemostasis, such as disorders of the platelet 
and vessel wall, disorders of coagulation and thrombosis, and anticoagulant, 
fibrinolytic and antiplatelet therapies. 

[015] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the cardiovascular system including 
disorders of the heart, such as heart failure; congenital heart disease; rheumatic fever;
cor pulmonale; cardiomyopathies, e.g., myocarditis; pericardial disease; cardiac
tumors; cardiac manifestations of systemic diseases; and vascular diseases, such as 
acute myocardial infarction, ischemic heart disease, hypertensive vascular disease, 
diseases of the aorta, and vascular diseases of the extremities. 

[016] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the respiratory system, such as asthma, 
hypersensitivity pneumonitis, e.g., with pulmonary infiltration, pneumonia, 
necrotizing pulmonary infections, bronchiectasis, cystic fibrosis, chronic bronchitis, 
emphysema and airway obstruction, interstitial lung diseases, primary pulmonary 
hypertension, pulmonary thromboembolism, disorders of the pleura, mediastinum, 
and diaphragm, disorders of ventilation, sleep apnea, and acute respiratory distress 
syndrome. 

[017] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the kidney and urinary tract, such as, for 
example, chronic renal failure and glomerulopathies. 

[018] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the gastrointestinal system, including 
disorders of the alimentary tract, such as, for example, peptic ulcer disease and related 
disorders, inflammatory bowel disease, irritable bowel syndrome; disorders of the 
liver and biliary tract, such as, for example, hyperbilirubinemias, acute viral hepatitis, 
chronic hepatitis, and cirrhosis; and disorders of the pancreas, such as acute or chronic 
pancreatitis. 

[019] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the immune system, connective tissue, and 
joints, including, for example, autoimmune diseases, primary immune deficiency 
diseases, human immunodeficiency virus diseases, allergies, systemic lupus 
erythematosus, rheumatoid arthritis, systemic sclerosis, Sjogren's syndrome, 
ankylosing spondylitis, reactive arthritis, vasculitis, sarcoidosis, amyloidosis, 
osteoarthritis, gout, and psoriatic and other arthritis.

[020] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the endocrine system, including, for 
example, disorders of the pituitary, hypothalamus, neurohypophysis, thyroid gland, 
adrenal cortex, testes, ovary, and other organs of the female reproductive system, such
as breast; as well as pheochromocytoma, diabetes mellitus, and hypoglycemia. 

[021] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of bone and mineral metabolism, and other 
metabolic processes, including, for example, diseases of the parathyroid gland and 
other hyper- and hypocalcemic disorders, osteoporosis, Paget's disease and other 
dysplasia of bone, disorders of lipoprotein metabolism, hemochromatosis, porphyrias,
disorders of purine and pyrimidine metabolism, Wilson's disease, lysosomal storage 
diseases, glycogen storage diseases, lipodystrophies, and other primary disorders of 
adipose tissue. 

[022] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the central nervous system, including, for 
example, seizures and epilepsy, cerebrovascular diseases, Alzheimer's disease and 
other extrapyramidal disorders, ataxic disorders, amyotrophic lateral sclerosis and 
other motor neuron diseases, disorders of the autonomic nervous system, diseases of 
the spinal cord, including spinal cord injury, primary and metastatic tumors of the 
nervous system, multiple sclerosis, and other demyelinating diseases, as well as 
chronic and recurrent meningitis. 

[023] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of nerves or muscle, including, for example, 
Guillain-Barre Syndrome, myasthenia gravis and other diseases of the neuromuscular 
junction, polymyositis, dermatomyositis, muscular dystrophies, and other muscle
diseases. 

[024] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the skin, including, for example, eczema, 
psoriasis, cutaneous infections, acne, and other common skin disorders, and 
immunologically mediated skin diseases. 

[025] The agonists or antagonists of the secreted proteins herein or 
fragments thereof can be useful in treating elevated levels of such proteins in any of the
disorders above, and including angina, anoxia, arrhythmias, asthma, atherosclerosis, 
benign prostatic hyperplasia, Buerger's Disease, cardiac arrest, cardiogenic shock, 
cerebral trauma, Crohn's Disease, congenital heart disease, mild congestive heart 
failure (CHF), severe congestive heart failure, cerebral ischemia, cerebral infarction, 
cerebral vasospasm, cirrhosis, diabetes, dilated cardiomyopathy, endotoxic shock, 
gastric mucosal damage, glaucoma, head injury, hemodialysis, hemorrhagic shock,
hypertension (essential), hypertension (malignant), hypertension (pulmonary), 
hypertension (e.g., pulmonary, after bypass), hypoglycemia, inflammatory arthritis, 
ischemic bowel disease, ischemic disease, male penile erectile dysfunction, malignant 
hemangioendothelioma, myocardial infarction, myocardial ischemia, prenatal 
asphyxia, postoperative cardiac surgery, prostate cancer, preeclampsia, Raynaud's 
Phenomenon, renal failure (acute), renal failure (chronic), renal ischemia, restenosis, 
sepsis syndrome, subarachnoid hemorrhage (acute), surgical operations, status 
epilepticus, stroke (thromboembolic), stroke (hemorrhagic), Takayasu's arteritis, 
ulcerative colitis, uremia after hemodialysis, and uremia before hemodialysis. 

[026] Secreted proteins can be screened for functional activities in 
appropriate functional assays, as is conventional in the art. Such assays include, for 
example, in vitro and in vivo assays for factors that stimulate the proliferation or 
differentiation of stem cells, progenitor cells, or precursor cells into T cells, B cells, 
pancreatic islet cells, bone cells, neuronal cells, etc. 

[027] The tetratricopeptide repeat (TPR) is an example of a protein domain 
characteristic of a protein family, and is present in some of the secreted polypeptides 
of the invention. The TPR family is characterized by a degenerate 34 amino acid 
sequence present in a wide variety of proteins; it mediates protein-protein interactions, 
and is involved in scaffold formation and the assembly of multiprotein complexes 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TPR). Secreted protein-related 
sequences can also possess or interact with cytochrome P450 domains, which are 
involved in the oxidative degradation of various compounds, including environmental 
toxins and mutagens (http://pfam.wustl.edu/cgi-bin/getdesc?name=p450). Secreted
protein-related sequences, e.g., cholesteryl ester transfer protein and phospholipid 
transfer protein, can also possess or interact with the LBP/BPI/CETP domain, which 
is characteristically found in lipid-binding serum glycoproteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=LBP_BPI_CETP). Secreted protein-related sequences can
also possess or interact with peptidase S8 domains, also known as subtilase domains,
which are comprised of serine proteases with a wide range of peptidase activities, 
including exopeptidase, endopeptidase, oligopeptidase, and omega-peptidase activity 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Peptidase_S8). Secreted protein-related
sequences can also possess or interact with adh_short, or short-chain
dehydrogenase domains, which are found in a large family of proteins, and are made 
up of short-chain dehydrogenases and reductase enzymes; most family members
function as NAD- or NADP-dependent oxidoreductases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=adh_short).

[028] The inventors herein have identified novel secreted proteins using an 
algorithm that is constructed on the basis of a number of attributes including 
hydrophobicity, two-dimensional structure, prediction of signal sequence cleavage 
site, and other parameters. Based on this algorithm, a sequence that has a secreted
tree vote of 0.5 - 1.0, preferably 0.6 - 1.0, is believed to be a secreted protein.
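
The thresholds above can be illustrated with a minimal sketch. It is not the inventors' algorithm, which is not set out in this section; it assumes only that a secreted tree vote between 0 and 1 has already been computed for each sequence. The function name, the label strings, and the example identifiers and scores are hypothetical.

```python
def classify_secreted(tree_vote: float) -> str:
    """Label a sequence by its secreted tree vote, using the ranges in the text."""
    # Assumption: the vote is a number on a 0-1 scale.
    if not 0.0 <= tree_vote <= 1.0:
        raise ValueError("tree vote is assumed to lie between 0 and 1")
    if tree_vote >= 0.6:
        return "secreted (preferred range)"  # 0.6 - 1.0
    if tree_vote >= 0.5:
        return "secreted"                    # 0.5 - 1.0
    return "not predicted secreted"


# Hypothetical scores for illustration only; not taken from the Sequence Listing.
example_votes = {"seq_A": 0.92, "seq_B": 0.55, "seq_C": 0.31}
labels = {name: classify_secreted(vote) for name, vote in example_votes.items()}
```

A sequence scoring in the preferred range also satisfies the broader range, so the sketch checks the narrower band first.
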
Transmembrane Protein-Related Sequences 

[029] Transmembrane proteins extend into or through the cell membrane's 
lipid bilayer; they can span the membrane once, or more than once. Transmembrane 
proteins that span the membrane once are "single transmembrane proteins" (STM), 
and transmembrane proteins that span the membrane more than once are "multiple 
transmembrane proteins" (MTM). Examples of transmembrane proteins include the 
insulin receptor, adenylate cyclase, and intestinal brush border esterase. 

[030] A single transmembrane protein typically has one transmembrane
(TM) domain, spanning a series of consecutive amino acid residues, numbered on the
basis of distance from the N-terminus, with the first amino acid residue at the
N-terminus as number 1. A multi-transmembrane protein typically has more than one
TM domain, each spanning a series of consecutive amino acid residues, numbered in 
the same way as the STM protein. 
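
The numbering convention described above can be sketched as a small data structure, offered only as an illustration. Each TM domain is held as an inclusive range of residue numbers counted from the N-terminus (residue 1), and a protein is labeled STM or MTM by how many such domains it has. The class name, function name, and example coordinates are hypothetical and are not taken from the Sequence Listing.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TMDomain:
    """One transmembrane segment, as an inclusive residue range."""
    start: int  # first residue of the segment; residue 1 is the N-terminus
    end: int    # last residue of the segment, inclusive

    def __post_init__(self) -> None:
        if self.start < 1 or self.end < self.start:
            raise ValueError("expected 1 <= start <= end")

    @property
    def length(self) -> int:
        """Number of consecutive residues spanned by the segment."""
        return self.end - self.start + 1


def classify_tm_protein(domains: List[TMDomain]) -> str:
    """Return 'STM' for one TM domain and 'MTM' for more than one."""
    if not domains:
        return "no TM domain"
    return "STM" if len(domains) == 1 else "MTM"


# Hypothetical coordinates for illustration only.
single_pass = [TMDomain(start=23, end=45)]
multi_pass = [TMDomain(12, 34), TMDomain(50, 72), TMDomain(90, 112)]
```
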

[031] Transmembrane proteins, having part of their molecules on either 
side of the bilayers, have many and widely variant biological functions. They 
transport molecules, e.g., ions or proteins across membranes, transduce signals across 
membranes, act as receptors, and function as antigens. Transmembrane proteins are 
often involved in cell signaling events; they can comprise signaling molecules, or can 
interact with signaling molecules. For example, tyrosine kinases can be 
transmembrane receptor proteins. Abnormalities of receptor tyrosine kinases are 
associated with human cancers; tumor cells are known to use receptor tyrosine kinases 
in transduction pathways to achieve tumor growth, angiogenesis and metastasis. 
Therefore, receptor tyrosine kinases represent pivotal targets in cancer therapy. It 
would be similarly advantageous to discover novel transmembrane proteins or 
polypeptides, and their corresponding polynucleotides that have additional medical 
utility. 
[032] The transmembrane polypeptides of the invention, like the secreted
polypeptides, also have many different functional domains, and belong to a wide 
variety of Pfam families. Transmembrane protein-related sequences can possess or 
interact with immunoglobulin (ig) domains, which are characteristically found in the 
immunoglobulin superfamily, comprised of hundreds of proteins, with various 
functions (http://pfam.wustl.edu/cgi-bin/getdesc?name=ig). Transmembrane protein- 
related sequences can also possess or interact with ion_trans domains, which are 
polypeptides characterized by six transmembrane helices, and which transport ions 
across membranes (http://pfam.wustl.edu/cgi-bin/getdesc?name=ion_trans). Proteins 
in this family can demonstrate specificity for particular ions, e.g., sodium, potassium, 
and calcium. Transmembrane protein-related sequences can also possess or interact 
with integrase core domains, which mediate the integration of a DNA copy of a viral 
genome into a host chromosome; e.g., HIV integrase catalyses the incorporation of 
virally derived DNA into the human genome, presenting a target for the development 
of new therapeutics for the treatment of AIDS
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rve). Transmembrane protein-related sequences can also possess
or interact with "differentially expressed in neoplastic vs. normal cells" (DENN)
domains, which are involved in signal transduction.
Characteristically, these domains are found in protein components of signaling 
pathways that utilize rab proteins or mitogen-activated protein (MAP) kinases 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DENN). 

[033] Transmembrane protein-related sequences can also possess or interact 
with acyl-CoA binding protein (ACBP) domains, which are protein domains that bind
medium- and long-chain acyl-CoA esters with high affinity
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ACBP). Membrane-related sequences can also possess or interact
with SPFH domain/band 7 family (Band_7) domains, which are protein domains that
include a transmembrane segment and regulate cation conductivity
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Band_7).

[034] Transmembrane proteins that are differentially expressed on the 
surface of cancer cells, particularly those that are differentially expressed on the 
surface of cancer cells but not on the surface of normal tissues, such as heart and lung, 
are desirable targets for production of antibodies, e.g., diagnostic antibodies or 
therapeutic antibodies, such as antibodies that mediate ADCC or CDC to effect tumor 
cell killing. 
[035] Transmembrane proteins with extracellular fragments that can be
cleaved can be useful as secreted proteins to effect ligand/receptor binding so as to 
mediate intracellular responses, such as signal transduction. Transmembrane proteins 
that act as receptors, and possess a ligand binding extracellular portion exposed on a 
cell surface and an intracellular portion that interacts with other cellular components 
upon activation can also be useful as transmembrane proteins to mediate
intracellular responses, such as signal transduction. 

Kinase-Related Sequences 

[036] A kinase is an enzyme that catalyzes the transfer of phosphate groups
from phosphate donors to acceptor substrates. Kinase substrates include, but are not
limited to, proteins and lipids. Sequences of the invention that phosphorylate protein
substrates are designated "Pkinases." Examples of kinase-related sequences include
calcium/calmodulin-dependent protein kinase II, myosin light chain kinase, and
phosphatidylinositol kinase.

[037] Kinases and phosphatases are counteracting: kinases add phosphate 
groups and phosphatases liberate phosphate groups. The counteracting activities of 
kinases and phosphatases provide cells with a "switch" that can turn on or turn off the 
function of various proteins. The activity of any protein regulated by phosphorylation 
depends on the balance, at any given time, between the activities of the kinase(s) that
phosphorylate it, and the phosphatase(s) that dephosphorylate it. Phosphorylation
plays an important role in intercellular communication during development,
homeostasis, and the function of major bodily systems, including the immune system. 

[038] In conjunction with phosphatases, kinases control such diverse and 
essential cellular processes as transcription, cell division, cell cycle progression, 
differentiation, cytoskeletal function, apoptosis, receptor function, learning and 
memory, hematopoiesis, fertilization, neural transmission, muscle contraction,
non-muscle motor function, glycogen metabolism, and hormone secretion.

[039] Most kinases act within a network of kinases and other signaling 
effectors, and are modulated by autophosphorylation and phosphorylation by other 
kinases (Manning et al., 2002). Intracellular signaling involves a multitude of diverse 
mechanisms that combine to modulate the activity of individual proteins in response
to different biological inputs.

[040] Defects in cell signal transduction pathways are responsible for a 
number of disorders, including the majority of cancers, immune disorders, and many 
inflammatory conditions, including, but not limited to, Crohn's disease (Geffen and
Man, 2002; Van Den Blink et al., 2002; Lodish, 1999). Over-expression and/or
structural alteration of kinases, for example, receptor tyrosine kinase family members, 
is often associated with human cancers. For example, tumor cells are known to use 
receptor tyrosine kinases in transduction pathways to achieve tumor growth, 
angiogenesis and metastasis. Therefore, receptor tyrosine kinases represent pivotal 
targets in cancer therapy. A number of small molecule receptor tyrosine kinase 
inhibitors have been synthesized, are in clinical trials, are being analyzed in animal 
models, or have been marketed. Inhibitory mechanisms include ligand-dependent 
down regulation, e.g., by the adaptor Cbl (Brunelleschi et al., 2002).

[041] Kinase-related sequences can possess or interact with protein kinase 
(pkinase) domains, which share a conserved catalytic core common in 
serine/threonine and tyrosine protein kinases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=pkinase). Kinase-related sequences can also possess or interact
with A-kinase anchoring protein 95 (AKAP95) domains, which comprise two zinc
fingers, and have been implicated in chromosome condensation
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AKAP95). Kinase-related sequences can also possess or
interact with inositol 1,3,4-trisphosphate 5/6 kinase (Ins134_P3_kin) domains, which
mediate the function of inositol 1,3,4-trisphosphate, a branch point in inositol
phosphate metabolism (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ins134_P3_kin).

[042] Kinases, by virtue of their participation in many and varied 
intracellular activities, are useful as targets of therapeutic intervention such as, for 
example, in cancer and inflammation. Cells transfected with cDNA encoding a kinase 
can be used in screening for small molecule agonists or antagonists, for example. 
Ligase-Related Sequences 

[043] Ligases are enzymes that join together, or ligate, two molecules. 
Ligase substrates include nucleic acids and proteins. For example, DNA ligases link 
two DNA molecules together; they play a role in DNA repair and replication. DNA 
ligases also are involved in the rearrangement of immunoglobulin gene segments, 
such as those responsible for the generation of antibody diversity. Examples of 
protein ligases include ubiquitin protein ligases, which add an ubiquitin molecule to 
an amino acid residue, typically as part of a peptide or polypeptide. Examples of 
nucleic acid ligases include DNA ligase I, DNA ligase III alpha, and T4 RNA 
ligase 2. 
[044] Ligases are also involved in cellular regulatory processes. For
example, glutamate-cysteine ligase (GCL) is the first and rate-limiting enzyme 
involved in the biosynthesis of glutathione. Polymorphisms of human GCL account 
for differences in sensitivity to environmental toxicants and chemotherapeutic agents 
in human cancer cell lines (Walsh et al., 2001). Also by way of example, glutamate- 
ammonia ligase, or glutamine synthetase (GS), is expressed at a higher than normal 
level in human primary liver cancer, and may be involved in hepatocyte 
transformation (Christa et al., 1994). 

[045] Ligase-related sequences can possess or interact with ATP-dependent
DNA ligase (DNA_ligase) domains, which can join two DNA fragments by
catalyzing the formation of an internucleotide ester bond between a phosphate and a
deoxyribose (http://pfam.wustl.edu/cgi-bin/getdesc?name=DNA_ligase). Ligase-related
sequences can also possess or interact with glutamate-cysteine ligase (GCS)
domains, which catalyze the rate-limiting step in the biosynthesis of glutathione
(http://pfam.wustl.edu/cgi-bin/getdesc?name=GCS). Ligase-related sequences can
also possess or interact with 2',5' RNA ligase (2_5_ligase) domains, which ligate
tRNA half molecules containing 2',3'-cyclic phosphate and 5'-hydroxyl termini to
products containing a 2',5'-phosphodiester linkage
(http://pfam.wustl.edu/cgi-bin/getdesc?name=2_5_ligase).

[046] Like kinases, ligases are also useful as targets for identification of 
agonists and antagonists, such as small molecule drugs. 

Receptor-Related Sequences (Including Nuclear Hormone and T-Cell Receptors) 

[047] A receptor is a polypeptide that binds to a specific signaling
molecule and initiates a cellular response. Receptors can be present on the cell
surface or inside the cell. Examples of receptor types include G-protein-linked
receptors, ion channel-linked receptors, enzyme-linked receptors, T-cell receptors, 
thyroid hormone receptors, retinoid receptors, nuclear hormone receptors, and the 
related category of steroid hormone receptors, e.g., cortisol receptors (Alberts et al.,
1994). 

[048] G-protein-linked receptors transduce extracellular signals into
intracellular responses by interacting with guanine nucleotide binding proteins. The
same ligand can activate many different G-protein-linked receptors. G-protein-linked 
receptors mediate cellular responses to a diverse range of signaling molecules, 
including hormones, neurotransmitters, and local mediators, which are varied in 
structure and function, and encompass proteins and small peptides, as well as amino
acids and their derivatives, and fatty acids and their derivatives. Many signaling 
molecules are active at low concentrations, and their receptors often bind with high 
affinity. Examples of G-protein-linked receptors include, but are not limited to, 
rhodopsins, olfactory receptors, and β-adrenergic receptors.

[049] Ion channel-linked receptors are involved in synaptic signaling.
These receptors regulate ion channels, to which they are linked. Some respond to
signals from neurotransmitters, e.g., acetylcholine, serotonin, GABA, and glycine. A 
common mechanism of action for ion channel-linked receptors is to transiently open 
or close their respective ion channel, transiently changing the permeability of the 
membrane in which they reside to a specific ion or ions. 

[050] Enzyme-linked receptors can be linked to enzymes or can 
function as enzymes. Their ligand binding site is commonly on one side of the 
membrane, e.g., an extracellular domain, and the catalytic site is on the other, e.g., a 
cytoplasmic domain. Transmembrane tyrosine-specific protein kinase receptors for 
growth and differentiation factors are enzyme-linked receptors; examples include 
receptors for epidermal growth factor (EGF), platelet-derived growth factor (PDGF), 
fibroblast growth factors (FGFs), hepatocyte growth factor (HGF), insulin, 
insulin-like growth factor-1 (IGF-1), nerve growth factor (NGF), vascular endothelial 
growth factor (VEGF), and macrophage colony stimulating factor (M-CSF). 

[051] Nuclear hormone receptors are generally intracellular; their ligands 
cross the plasma membrane of target cells and bind to the receptors inside the cell. Ligand 
binding activates these receptors in some instances, exposing a DNA binding domain 
which regulates the transcription of specific genes. Generally, nuclear hormone 
receptors bind to specific DNA sequences adjacent to or in the vicinity of the genes 
regulated by their ligand. A host of cell type-specific regulatory proteins can 
collaborate with the nuclear hormone receptor to influence the transcription of 
specific genes or sets of genes (Alberts et al., 1994). Examples of nuclear hormone 
receptors include estrogen-related receptors, such as hERR1, which modulates the 
estrogen receptor-mediated response of the lactoferrin gene promoter (Yang et al., 
1996), and is a transcriptional regulator of the human medium chain acyl coenzyme A 
dehydrogenase gene (Sladek et al., 1997). Examples of nuclear hormone receptors 
also include photoreceptor-specific nuclear receptors, such as NR2E3, which are part 
of a large family of nuclear receptor transcription factors involved in signaling 
pathways. NR2E3 plays a role in cone function and human retinal photoreceptor 
differentiation and degeneration (Milam et al., 2002; Kobayashi et al., 1999). 

[052] T-cell receptors are membrane proteins comprised of two 
disulfide-linked polypeptide chains, each with two immunoglobulin-like domains. 
They display a similarity to antibodies in that they have a variable amino-terminal 
region and a constant carboxyl-terminal region, which are coded for by variable, 
joining, and constant region genes (Wei et al., 1997; Alberts et al., 1994). 
Rearrangements of T-cell receptor genes have been associated with human T-cell 
leukemias (Fisch et al., 1993). 

[053] Receptors are involved in cellular processes that regulate growth 
and differentiation. Their dysregulation can lead to hyperproliferative conditions, and 
they are common therapeutic targets. For example, the EGF receptor is aberrantly 
activated in neoplasia, especially in tumors of epithelial origin. EGF receptor 
antagonists can successfully treat some of these tumors, either alone or in 
combination with chemotherapy or ionizing radiation (Kari et al., 2003). The 
progesterone receptor, an intracellular steroid hormone receptor, plays a role in the 
development and function of the mammary gland, the uterus, and the ovary. Mutation 
or aberrant expression of the progesterone receptor, or its regulatory molecules, can 
affect its normal function and lead to cancer (Gao and Nawaz, 2002). 

[054] Receptors are also involved in cellular processes that regulate 
inflammation and immunity. For example, members of the type 1 interleukin-1 
receptor family mediate immune and inflammatory responses, and function in host 
defense (O'Neill, 2002). Their activation can lead to the activation of signaling 
cascades, e.g., pathways involving transcription factors and protein kinases, resulting 
in an inflammatory response (O'Neill, 2002). Another mechanism by which receptors 
regulate inflammation and immunity is by their selective expression, at discrete stages 
of differentiation, by cells involved in the inflammatory response. For example, 
expression of the triggering receptor expressed on myeloid cells (TREM-1) and the 
myeloid DAP12-associating lectin (MDL-1) are correlated with myelomonocytic 
differentiation. These receptors are more highly expressed in differentiated cells, are 
involved in monocyte activation and the inflammatory response, and are expressed at 
a lower level in malignant compared to normal cells (Gingras et al., 2002). 

[055] Receptor-related sequences can possess or interact with seven 
transmembrane receptor (7tm_1) domains, which are protein domains with a 
structural framework comprising seven transmembrane helices found in receptors, 
e.g., receptors in the rhodopsin family with a wide range of functions, activated by 
ligands that vary widely in structure and character 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=7tm_1). Receptor-related sequences can 
also possess or interact with L1 transposable element (transposase_22) domains, some 
of which have been characterized to exhibit reverse transcriptase activity, and some of 
which are capable of retrotransposition. Receptor-related sequences can also possess 
or interact with an SH2 domain, which is a protein domain of about 100 amino acid 
residues found in many intracellular signal-transducing proteins, that can regulate 
intracellular signaling cascades by interacting with phosphotyrosine-containing target 
peptides in a sequence-specific and phosphorylation-dependent manner 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH2). Receptor-related sequences can 
also possess or interact with LDL receptor domains, e.g., the low-density lipoprotein 
receptor repeat class B (Ldl_recept_b) domain, which comprises a conserved YWTD 
motif in multiple tandem repeats 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ldl_recept_b). 
Receptor-related sequences can also possess or interact with ribosomal L10 
(Ribosomal_L10e) domains, which are protein domains commonly found in the large 
ribosomal subunit (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L10e). 

[056] Receptor-related sequences can possess or interact with zinc 
finger C4 type domains, which are DNA binding domains of nuclear hormone 
receptors that share a conserved cysteine-rich region of approximately 65 amino acids 
and regulate such diverse biological processes as pattern formation, cellular 
differentiation, and homeostasis 
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00105). Receptor-related sequences 
can also possess or interact with a ligand binding domain of nuclear hormone 
receptors (hormone_rec), which are helical domains involved in the regulation of 
eukaryotic gene expression, cellular proliferation, and differentiation in target tissues 
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00104). Receptor-related sequences 
can also possess or interact with Mov34 domains, which are regulatory subunits of the 
proteasome found in some regulators of transcription factors 
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF01398). Receptor-related sequences 
can also possess or interact with immunoglobulin domains, which are described above. 

[057] Receptors and fragments of receptors can be used as 
therapeutics. For example, a ligand-binding portion, an effector-binding portion, and 
a kinase or phosphatase domain or consensus sequence can comprise fragments that 
can function as agonists or antagonists to enhance or reduce, e.g., ligand binding to 
the natural receptors, or effector function by the natural receptors. 

Phosphatase-Related Sequences 

[058] A phosphatase, as indicated above, is an enzyme that catalyses the 
hydrolysis of esters of phosphoric acid. Its substrates include, but are not limited to, 
nucleic acids, proteins, and lipids. Together with kinases, phosphatases are active in a 
broad range of cellular functions, including transcription, cell division, cell-cycle 
progression, intermediate cellular metabolism, glycogen metabolism, lipogenesis and 
lipolysis, maintenance of electrochemical gradients, neuronal function, immune 
responses, intracellular vesicular transport, cytoskeletal function, sperm motility, and 
skeletal, cardiac, and smooth muscle function (Oliver and Shenolikar, 1998). 

[059] Disruption in these functions may lead to disorders. For example, as 
noted above, phosphatases regulate pathways of cell growth and programmed cell 
death; disruptions in these pathways can lead to abnormal cell growth, such as that 
which occurs in cancer. Mutations in serine/threonine protein phosphatase 2A 
(PP2A), a multifunctional regulator of cell growth and function, are associated with 
the increased growth of tumor cells (Schonthal, 2001). The tumor suppressor 
"phosphatase and tensin-homology deleted on chromosome 10" (PTEN) gene encodes 
a lipid phosphatase that dephosphorylates phosphatidylinositol (3,4,5)-trisphosphate 
(PIP3), thus countering the action of the oncogenes PI3-kinase and Akt, which 
promote cell survival. PTEN 
has been identified as a tumor suppressor; it is deleted in multiple types of advanced 
human cancers. 

[060] Also as noted above, phosphatases regulate pathways that control 
immune function. For example, the CD45 phosphotyrosine phosphatase is one of the 
most abundant glycoproteins expressed on immune cells, and regulates T-cell 
signaling and development (Alexander, 2000). In addition, the serine/threonine 
phosphatase calcineurin plays a central role in lymphocyte activation, among other 
important and wide-ranging cellular functions (Baksh and Burakoff, 2000). Certain 
compounds, specifically, cyclosporine and FK-506 (Tacrolimus), have been found to 
inhibit the phosphatase activity of calcineurin, thereby suppressing the production of 
IL-2 and other cytokines. In addition, these compounds have recently been found to 
block the JNK and p38 signaling pathways triggered by antigen recognition in T-cells. 
Finally, phosphatase inhibitors have proven to be valuable as immune suppressant 
drugs, and those in the field believe that modulators of phosphatase activity promise 
to be important immunoregulatory compounds (Allison, 2000). 

[061] Phosphatase-related sequences can possess or interact with protein 
phosphatase 2C (PP2C) domains, which display Mn2+- or Mg2+-dependent protein 
serine/threonine phosphatase activity 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=PP2C). Phosphatase-related sequences 
can also possess or interact with protein-tyrosine phosphatase (Y_phosphatase) 
domains, which catalyze the removal of a phosphate group attached to a tyrosine 
residue (http://pfam.wustl.edu/cgi-bin/getdesc?name=Y_phosphatase). 
Phosphatase-related sequences can also possess or interact with protein phosphatase 
inhibitor 1/DARPP-32 (DARPP-32) domains, which inhibit protein phosphatases, and 
play a role in regulating neurotransmitter pathways, receptors, and ion channels 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DARPP-32). 

[062] Like kinases, phosphatases can be used as targets for therapeutic 
intervention, in cell-free or cell-based assays, for example, in screening for drugs, 
including small molecule drugs. 
Protease-Related Sequences 

[063] Proteases, also known as endopeptidases, are enzymes that cleave 
polypeptide chains by hydrolyzing peptide bonds at positions within the amino acid 
chain. Different proteases recognize different polypeptide sequences. Endopeptidase 
substrate specificities vary from broad to narrow; for example, subtilisins are 
relatively non-specific, and can cleave polypeptide chains with a wide variety of 
amino acid sequences, whereas thrombin is more specific and can only cleave 
polypeptide chains with an arginine residue on the carboxyl side of the susceptible 
peptide bond and glycine on the amino side. Additional examples of protease-related 
sequences include collagenases, trypsin, and damage-induced neuronal endopeptidase 
(Kiryu-Seo et al., 2000). 
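
By way of illustration only, the narrow specificity described above for thrombin can be modeled as a search for arginine-glycine (Arg-Gly) bonds in a one-letter amino acid sequence. The following Python sketch is a deliberately simplified assumption, not a description of any method of the invention: the function names and the example sequence are hypothetical, and actual thrombin recognition depends on additional sequence and structural context that is not modeled here.

```python
import re
from typing import List


def arg_gly_cleavage_sites(sequence: str) -> List[int]:
    """Return 1-based positions of Arg (R) residues immediately followed by Gly (G).

    Each reported position marks the residue after which a simplified,
    thrombin-like protease would be expected to cut.
    """
    seq = sequence.upper()
    # The lookahead (?=G) requires a glycine next without consuming it.
    return [match.start() + 1 for match in re.finditer(r"R(?=G)", seq)]


def cleave_after(sequence: str, positions: List[int]) -> List[str]:
    """Split a sequence into fragments by cutting after each 1-based position."""
    fragments = []
    start = 0
    for pos in sorted(positions):
        fragments.append(sequence[start:pos])
        start = pos
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    peptide = "AARGKRGP"  # hypothetical example sequence
    sites = arg_gly_cleavage_sites(peptide)
    print(sites)
    print(cleave_after(peptide, sites))
```

A relatively non-specific protease such as a subtilisin would not be well represented by a single fixed pattern of this kind, which is the contrast the paragraph above draws.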

[064] Proteases mediate the continuous remodeling of living tissues. For 
example, the extracellular matrix, a tissue skeleton that mediates communication 
among cells, and influences the structure and function of associated tissues and 
organs, is continuously remodeled. A strictly controlled balance is maintained 
between breakdown of the extracellular matrix by proteases and reconstruction of the 
extracellular matrix. This continued matrix remodeling is a dynamic process that 
shapes the structure and function of tissues and organs (Wojtowicz-Praga, 1999). 

[065] Defects in protease function are responsible for a number of 
disorders, including cancer and other hyperproliferative disorders. Proteases are 
involved in the pathogenesis of such disorders by virtue of their involvement in both 
programmed cell death and tumor invasion and metastasis (Los et al., 2003; 
Stetler-Stevenson et al., 1993). Detection of the presence or characteristics of proteases can 
be used to screen for and diagnose prostate cancer (Karanazanashvili and 
Abrahamsson, 2003). Proteases are also involved in the pathogenesis of 
inflammatory and arthritic diseases, such as pancreatitis, osteoarthritis, and 
rheumatoid arthritis (Pfutzer and Whitcomb, 2001; Martel-Pelleteir et al., 2001; Lerch 
and Gorelick, 2000). 

[066] Protease-related sequences possess or interact with a variety of 
different protease domains, including domains belonging to the cysteine protease 
family, the serine protease family, and the metalloproteinase family 
(http://pfam.wustl.edu/cgi-bin/text search?terms=endopeptidase&search_what=all&sections=DE&sections=CC&size=10). 
Phosphodiesterase-Related Sequences 

[067] Phosphodiesterases are enzymes that cleave phosphodiester 
bonds, i.e., bonds formed by two hydroxyl groups in an ester linkage to the same 
phosphate group, such as those between adjacent RNA or DNA nucleotides. 
Phosphodiesterases are found in both soluble and membrane-associated forms. Most 
phosphodiesterases act within a network of signal transduction molecules and other 
signaling effectors, and are modulated by components of these pathways. 
Phosphodiesterases regulate the metabolism and synthesis of cyclic nucleotides in 
signal-transduction pathways. They hydrolyze cAMP and cGMP, molecules that play 
an important and widespread role in signal transduction. Phosphodiesterases also 
repair damage to nucleic acids. Some phosphodiesterases are regulated primarily by 
calcium and calmodulin; others are regulated primarily by cGMP. They differ in their 
sensitivity to individual inhibitors, but all share a homologous catalytic region (Siegel 
et al., 1999). 

[068] Examples of phosphodiesterases include nucleotide 
pyrophosphatases (NPP) and plasma membrane glycoprotein PC-1, which are present 
in elevated levels in the fibroblasts of patients with Lowe's syndrome (Funakoshi et 
al., 1992). Another example of a phosphodiesterase is myomegalin-like protein, 
which is expressed at high levels in the nucleus and cytoplasm of heart and skeletal 
muscle (Soejima et al., 2001). Phosphodiesterases have demonstrated promise in 
cancer chemotherapy, analgesia, the treatment of Parkinson's disease, and the 
treatment of learning and memory disorders (Weishaar et al., 1985). 

[069] Phosphodiesterase-related sequences can possess or interact with 
type I phosphodiesterase/nucleotide pyrophosphatase (phosphodiest) domains, which 
catalyze the cleavage of phosphodiester and phosphosulfate bonds 
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF0...). Phosphodiesterase-related 
sequences can also possess or interact with 3',5'-cyclic nucleotide phosphodiesterase 
(PDEase) domains, which are involved in signal transduction 
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00233). 

[070] Phosphodiesterases (PDEs) are also useful as targets for 
therapeutic intervention, for example, for identification of agonists or antagonists, 
such as in the screening of small molecule inhibitors. A well known PDE-5 inhibitor, 
sildenafil citrate (Viagra®), is used for treatment of erectile dysfunction (Brock, 
2000). The mechanism of action involves inhibition of the PDE-5 enzyme and a 
resulting increase in cyclic guanosine monophosphate (cGMP) and smooth muscle 
relaxation in the penis (Rosen and McKenna, 2002). Such inhibitors may also find use 
for treatment of severe pulmonary arterial hypertension (Ghofrani et al., 2003). 

Kinesin-Related Sequences 

[071] Cells transport proteins and organelles in an orderly and 
regulated manner along cytoskeletal filaments. Molecular motor proteins, such as 
kinesins, can carry such cargo along the cytoskeletal filaments to specific 
destinations, in a highly regulated manner. Exemplary membrane-bound cargoes 
include mitochondria, lysosomes, endoplasmic reticulum, and axonal vesicles (Vale, 
2003). Kinesins also transport nonmembranous cargo, such as mRNAs, tubulin 
monomers, and intermediate filaments (Vale, 2003). 

[072] Kinesins, e.g., KIF11, function in the cell division process (Miki 
et al., 2001). In the nucleus, kinesins are necessary to establish spindle bipolarity, 
position chromosomes on metaphase plates, and maintain forces in the spindle. 
Several members of the kinesin family are associated with the chromosomes, and are 
likely to perform a role in mitotic chromosome movement (Miki et al., 2001). For 
example, the C-terminal kinesin KIFC1 is involved in the processes of meiosis, 
mitosis, and karyogamy (Miki et al., 2001). The kinesin GAKIN binds to the human 
analog of the Drosophila Discs Large tumor suppressor protein (hDlg), a membrane 
associated guanylate kinase (Hanada, 2000). GAKIN undergoes translocation in T- 
lymphocytes upon their cellular activation (Hanada, 2000). The GAKIN/hDlg 
complex is also hypothesized to play a role in cell division (Hanada, 2000). Thus, the 
kinesin GAKIN plays a role in cell proliferation and T-cell mediated immune 
function. 

[073] Kinesin-mediated intracellular transport is also implicated as a 
mechanism of tumorigenesis. For example, kinesin transports the tumor suppressor 
adenomatous polyposis coli protein (APC) (Jimbo et al., 2002). The APC gene is 
mutated in both sporadic and familial colorectal tumors. The APC protein interacts 
with the microtubule plus-end-directed kinesin proteins KIF3A and KIF3B through an 
association with the kinesin superfamily-associated protein 3 (KAP3). Normally, the 
APC tumor suppressor is transported to its correct intracellular location at the tips of 
membrane protrusions. Mutant APCs derived from cancer cells, however, are unable 
to undergo kinesin-mediated transport, do not accumulate with normal efficiency 
in clusters in the membrane protrusions, and thereby cannot function efficiently as 
tumor suppressors. 

[074] In view of the connection to cancer, investigators have sought 
small molecules to inhibit specific molecular motors in cells, such as the mitotic 
kinesin Eg5/KSP (Mayer, 1999). In addition, others have found that small molecule 
inhibitors of Eg5/KSP with low nanomolar affinity have anti-tumor activity, and one 
such agent has entered phase I clinical trials (Vale, 2003). 

[075] In another arena, it has been proposed that impairing motor- 
driven delivery of MHC peptide complexes to the surface of dendritic cells could 
provide immunomodulation. Additionally, inhibiting the cell surface delivery of 
cytotoxic granules in T cells could help provide immunosuppressive therapy (Vale, 
2003). 

[076] Kinesin-related sequences can possess or interact with kinesin 
motor (kinesin) domains, which hydrolyze ATP and bind to microtubules to produce a 
motor-active force that transports intracellular vesicles and organelles 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=kinesin). Kinesin-related sequences can 
also possess or interact with kinesin-associated protein (KAP) domains, which are 
non-motive domains that form a complex with kinesin 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=KAP). Kinesin-related sequences can 
also possess or interact with MyTH4 domains, which are present in the tail of the 
motor ATPase proteins kinesin and myosin 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=MyTH4). 

[077] Kinesins, like kinases, are useful as targets for therapeutic 
intervention, for example, in screening for small molecule inhibitors for the treatment 
of cancer. 

Immunoglobulin-Related Sequences 

[078] An immunoglobulin is an antibody molecule, and is typically 
composed of heavy and light chains, each of which has constant regions that display 
similarity with other immunoglobulin molecules and variable regions that convey 
specificity to particular antigens. Most immunoglobulins can be assigned to classes, 
e.g., IgG, IgM, IgA, IgE, and IgD, based on antigenic determinants in the heavy chain 
constant region; each class plays a different role in the immune response. 

[079] Immunoglobulins are characterized by a structural motif, the 
immunoglobulin (Ig) domain, which is approximately one hundred amino acids long, 
is involved in protein-protein and protein-ligand interactions, and includes a 
conserved intradomain disulfide bond (http://pfam.wustl.edu/cgi-bin/getdesc?name=ig). 
It is one of the most common domains found among all known proteins, 
and is present in hundreds of proteins with diverse functions. Proteins with the ig 
domain comprise the immunoglobulin superfamily; members include antibodies, 
T-cell receptors, major histocompatibility proteins, the CD4, CD8, and CD28 
co-receptors, most of the invariant polypeptide chains associated with B and T cell 
receptors, leukocyte Fc receptors, the giant muscle kinase titin, and receptor tyrosine 
kinases (Janeway et al., 2001; Alberts et al., 1994). 

[080] Polypeptides with immunoglobulin-like domains can be markers for 
specific types of tissues and tumors. For example, a 43-kDa protein membrane 
antigen with two immunoglobulin-like domains in its extracellular region is expressed 
in normal human colonic and small bowel epithelium and >95% of human colon 
cancers, but absent from most other human tissues and tumor types (Heath et al., 
1997). 

[081] Polypeptides with immunoglobulin-like domains are also involved in 
inflammation. For example, myelin oligodendrocyte glycoprotein, a myelin-specific 
protein found in the central nervous system, specifically binds to and activates 
complement, an effector of the immune system, via its extracellular 
immunoglobulin-like domain. By virtue of providing the means for an interaction between myelin and 
the complement component of the immune response, myelin oligodendrocyte 
glycoprotein is a modulator of central nervous system inflammation and has been 
predicted by those in the field to be relevant to the pathogenesis of demyelinating 
diseases such as multiple sclerosis (Johns and Barnard, 1997). 

[082] Immunoglobulin-related sequences can also possess or interact with 
leucine-rich repeat domains, which are involved in protein-protein interactions, and 
are used in molecular recognition processes as diverse as signal transduction, cell 
adhesion, cell development, DNA repair and RNA processing 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=LRRNT). Immunoglobulin-related 
sequences can also possess or interact with fibronectin type III repeat (fn3) domains 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=fn3), which contain binding sites for 
DNA and heparin. Immunoglobulin-related sequences can also possess or interact 
with WASp Homology domain 1 (WH1), which can bind the metabotropic glutamate 
receptors mGluR1alpha and mGluR5 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=WH1). 

Glycosylphosphatidylinositol Anchor-Related Sequences 

[083] Glycosylphosphatidylinositol (GPI) anchor proteins are 
synthesized as single membrane proteins; the transmembrane segment is cleaved 
away in the endoplasmic reticulum, where a GPI membrane anchor is added. The 
resulting protein is bound to the non-cytoplasmic, i.e., either extracellular or luminal, 
side of the membrane by the GPI anchor. GPI anchor proteins can be dissociated 
from the membrane by phosphatidylinositol-specific phospholipase C 
(Alberts et al., 1994). Examples of GPI-anchor proteins include prefoldin, a 
chaperone that delivers unfolded proteins to cytosolic chaperonin (Vainberg et al., 
1998), and carboxypeptidase M, which is associated with the differentiation of 
monocytes to macrophages (Rehli et al., 1995). 

[084] GPI anchor protein-related sequences can possess or interact with 
KE2 domains, which may contain a DNA binding leucine zipper motif 
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF01920). GPI anchor protein-related 
sequences can also possess or interact with zinc carboxypeptidase (Zn_carbOpept) 
domains, which include carboxypeptidase H regulatory domains and carboxypeptidase A 
digestive domains (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00246). 

Other Polypeptide-Related Sequences 
Activator-Related Sequences 

[085] An activator is a molecule or collection of molecules that 
positively modulates the activity of a regulatory protein, or that binds to DNA and 
regulates one or more genes by increasing the rate of transcription. Regulatory 
protein activators contribute to an increase in protein activity. Transcriptional 
activators provide a positive control over gene transcription; for example, they can 
sense the internal condition of the cell and bind to a sequence of DNA near a target 
promoter, resulting in the transcription of an appropriate gene. Examples of activator- 
related sequences include template-activating factors, bacterial catabolite activators, 
and the coenzyme thiamine pyrophosphatase. Activator-related sequences, e.g., 
factors that influence viral replication and transcription, can be encoded by oncogenes 
(Nagata et al., 1995). 

[086] Activator-related sequences can possess or interact with SH2 
domains, which are protein domains of about 100 amino acid residues found in many 
signal-transducing proteins. SH2 domains can regulate signaling cascades, e.g., by 
interacting with phosphotyrosine-containing target peptides in a sequence-specific and 
phosphorylation-dependent manner 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH2). Activator-related sequences also 
possess or interact with nucleosome assembly protein (NAP) domains, which regulate 
gene expression, and are accessible to histones 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=NAP). 
Adaptor-Related Sequences 

[087] Adaptors are proteins involved in the process of capturing 
specific cargo molecules into membrane-bound vesicles for transport through the cell. 
Different adaptors recognize different receptors for cargo molecules, and also 
recognize different vesicle coat proteins, accounting, in part, for the specificity of the 
content of intracellular vesicles bound to specific destinations within the cell (Kirsch 
et al., 1999). Examples of adaptor-related sequences include adaptins, clathrins, 
adaptor-related protein complex subunits, and Cas ligand with multiple Src homology 
3 domains (CMS) adaptors. 

[088] Adaptor-related sequences can possess or interact with src 
homology 3 (SH3) domains, which are small protein modules of approximately 50 
amino acid residues found in a variety of intracellular or membrane-associated 
proteins. SH3 domains are often indicative of a protein involved in signal 
transduction events related to cytoskeletal organization 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH3). Adaptor-related sequences also 
possess or interact with the adaptin N-terminal (Adaptin_N) protein domain, which is 
found in the N-terminal region of various adaptor protein complexes. The N-terminal 
region of adaptor proteins is relatively constant in comparison to the C-terminal 
region (http://pfam.wustl.edu/cgi-bin/getdesc?name=Adaptin_N). 

Adhesion Molecule-Related Sequences 

[089] Adhesion molecules are molecules that mediate the adhesion of 
cells with other cells, and with the extracellular matrix. Examples of adhesion 
molecules include members of the immunoglobulin superfamily, integrins, cadherins, 
selectins, and transmembrane proteoglycans. The adhesion molecule 
carcinoembryonic antigen (CEA) is present nearly exclusively on cancer cells, and is 
expressed on the cell surface of approximately 80% of all solid cancerous tumors 
(Berinstein et al., 2002). 

[090] Adhesion molecule-related sequences can possess or interact with 
immunoglobulin (ig) domains, which are described above. Adhesion 
molecule-related sequences can also possess or interact with integrin alpha 
cytoplasmic region (integrin_A) domains, which comprise the short, intracellular 
region of the integrin alpha chain 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=integrin_A). 

Antigen-Related Sequences 

[091] Antigens are molecules that provoke an immune response; they 
include both foreign antigens and autoantigens. Antigens can be expressed in a 
tissue-specific manner and their expression can be developmentally regulated. For 
example, the heat stable antigen HSA is expressed in both a tissue-specific manner, 
i.e., it is restricted to hematopoietic cells, and a developmentally-regulated manner, 
i.e., it is more highly expressed in immature precursor cells than in terminally 
differentiated cells (Wenger et al., 1993). Antigens can be expressed on the cell 
surface or inside the cell, e.g., in the nucleus or on intermediate filaments. Antigen- 
related sequences include sequences related to tumor antigens, which are expressed 
exclusively in tumor cells, or in greater amounts in tumor cells than in normal cells. 
Tumor antigens can be transmembrane proteins, with one or more transmembrane 
domains (Li et al., 1996; Linnenbach, et al., 1993). 

[092] Autoantigens, which are components of the body that provoke an 
immune response, are involved in the pathogenesis of autoimmune disease. 
Autoantigens can be either selectively or ubiquitously expressed among cell and 
tissue types. They can be localized to any region of the cell, including the nucleus, 
nucleolus, nuclear envelope, and intermediate filaments (Racevskis et al., 1996). For 
example, pancreatic islet cell antigens are involved in the autoimmune pathogenesis 
of diabetes, and thyroid antigens are involved in autoimmune thyroid disease. 

[093] Antigen-related sequences can possess or interact with the ICAp69 
domain, which is characterized by a 69 kDa pancreatic islet cell autoantigen present in 
autoimmune (insulin-dependent) diabetes mellitus 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ICA69). Antigen-related sequences can 
also possess or interact with the Ku70/Ku80 C-terminal arm (Ku_C) or Ku70/Ku80 
N-terminal alpha/beta (Ku_N) domains, which belong to the Ku family of peptides 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ku_C; 
http://pfam.wustl.edu/cgi-bin/getdesc?name=Ku_N). Ku, an antigen associated with 
autoimmune disease, normally 
functions to bind DNA double-strand breaks and facilitate DNA repair, but induces 
autoimmunity under pathological conditions. Antigen-related sequences can also 
possess or interact with the bZIP transcription factor (bZIP) domain, which comprises 
a basic region and a leucine zipper region (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=bZIP). Antigen-related sequences can possess or interact with YT521-B-like 
(YTH) domains, which comprise YT521-B, a tyrosine-phosphorylated nuclear protein 
domain that modulates alternative RNA splice site selection, and interacts with other 
nuclear proteins, e.g., scaffold attachment factor B, and Sam68, a 68-kDa substrate 
associated with Src during mitosis (http://pfam.wustl.edu/cgi-bin/getdesc?name=YTH). 

ATPase-Related Sequences 

[094] ATPases are enzymes that use the energy of ATP hydrolysis to 
move ions or small molecules across a membrane against a chemical concentration 
gradient or electrical potential. For example, ATPases can maintain low intracellular 
calcium and sodium ion concentrations, and generate a low pH inside lysosomes, 
plant-cell vacuoles, and the lumen of the stomach. Vacuolar ATPases are ATP- 
dependent proton pumps that create pH gradients by transporting protons across 
membranes, while coupling the energy produced in the conversion of ATP to ADP 
with proton transport (Forgac, 1999). They can acidify or alkalinize cells, organelles, 
and extracellular compartments, and create voltage gradients that drive the secretion 
or absorption of ions and fluids (Wieczorek et al. 1999). Examples of ATPase-related 
sequences include proton transporters, glucose transporters, multidrug resistance 
factors, calcium ATPases, and porins. 

[095] ATPase-related sequences can possess or interact with ATP 
synthase F/14-kDa subunit (ATP-synt_F) domains, which correspond to a 14-kDa 
subunit in the peripheral catalytic part of vacuolar ATPases 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ATP-synt_F). ATPase-related sequences 
can also possess or interact with vacuolar (H+)-ATPase C, D, G, and H subunit 
(V-ATPase) domains, which are membrane-attached sequences that generate an acidic 
environment (http://pfam.wustl.edu/cgi-bin/getdesc?name=V-ATPase_G). 

ATP-Related Sequences 

[096] Adenosine triphosphate (ATP) is a nucleotide comprising an 
adenine, a ribose, and a triphosphate unit. The triphosphate unit contains two 
phosphoanhydride bonds that confer an energy-rich property to ATP. The free energy 
liberated in the hydrolysis of one or both of these bonds can drive reactions that 
require an input of free energy. A wide range of physiological and pathological 
processes are driven by the energy of ATP, including cellular movement, the 
synthesis of biomolecules from precursors, muscle contraction, ciliary and flagellar 
function, intermediary metabolism, glycolysis, fatty acid oxidation, oxidative 
phosphorylation, and membrane transport (Ku et al., 1990). Examples of ATP-related 
sequences include ATPases, ATP synthases, ATP carrier proteins, and myosin. 

[097] ATP-related sequences can possess or interact with ATP-synthase 
subunit C protein domains (ATP-synt_C), which are protein domains that 
consist of two long terminal hydrophobic regions, and are implicated in the 
proton-conducting activity of ATPases 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ATP-synt_C). ATP-related sequences can 
also possess or interact with mitochondrial carrier protein (mito_carr) domains, 
which are involved in energy transfer across the inner mitochondrial membrane 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=mito_carr). 

Binding Protein-Related Sequences 

[098] A binding protein is a protein that binds to another molecule with 
specificity. Binding proteins can be involved in building macromolecular structures, 
e.g., in cytoskeletal assembly or scaffolding (Machesky et al., 1997). Proteins often 
exist in the cell in complexes with other proteins, nucleic acids, lipids, and/or small 
molecules. For example, steroid receptors, e.g., the progestin, estrogen, androgen, 
and glucocorticoid receptors, bind to heat-shock proteins and FKBP52, a calcium- 
regulated immunosuppressant, to form functional complexes (Peattie et al., 1992; 
Sanchez et al., 1990). DNA binding proteins and general transcription factors bind to 
the TATA box, a consensus sequence in a gene's promoter region that specifies the 
position of transcription initiation, forming a functional transcription complex (Chalut 
et al., 1995). Proteins can interact with multiple molecules simultaneously. For 
example, Nedd4, a ubiquitin-protein ligase, can interact with multiple proteins and 
lipids through its lipid binding domain and multiple protein binding domains (Jolliffe 
et al., 2000). 

[099] Proteins utilize a large number of motifs to bind other molecules. 
Binding protein-related sequences can possess or interact with the cold-shock DNA- 
binding (CSD) domain, a conserved domain of about 70 amino acids that helps the 
cell survive in temperatures below optimum growth temperature by inducing the 
synthesis of proteins that negatively regulate transcription, translation, and 
recombination, resulting in suppressed cell proliferation (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=CSD). Proteins induced by exposure to cold include DNA-binding 
proteins, and cold inducible RNA binding proteins, which have RNA binding 
domains at or near their N-termini (Nishiyama et al., 1997). For example, contrin, a 
testis-specific DNA/RNA binding protein with a cold shock domain also has a large 
number of phosphorylation sites, each of which can mediate intermolecular 
interactions (Tekur et al., 1999). Contrin is involved in transcription of testis-specific 
genes; its inactivation could provide a reversible male contraceptive. 

[0100] Binding protein-related sequences can possess or interact with the 
ARID/BRIGHT DNA binding (ARID) domain, which is an approximately 100 amino 
acid sequence involved in a wide range of DNA interactions, including, but not 
limited to, interaction with AT-rich regions 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ARID). ARID-encoding genes are involved 
in a variety of biological 
processes, including regulation of cell growth, development, cell lineage gene 
regulation, cell cycle control, and tissue-specific gene expression. 

[0101] Binding protein-related sequences can also possess or interact with 
nucleosomal binding domains to facilitate binding within the nucleosome, a nuclear 
structure comprised of chromosomal DNA and proteins. For example, the HMG14 
and HMG17 (HMG14_17) domain is present in some nucleosome proteins, most 
commonly, in proteins HMG14 and HMG17, members of a family designated as high 
mobility group proteins, which form components of chromatin, and bind to 
nucleosomal DNA, regulating the interaction of the DNA with histone proteins 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HMG14_17). 

[0102] Binding protein-related sequences can also possess or interact with 
conserved motifs that recognize RNA, and allow the protein to bind RNA 
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=rna+binding&search_what=
all&sections=DE&sections=CC&size=100). These motifs include the RNA 
recognition (rrm) domain, also known as an RRM, RBD, or RNP domain 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rrm). Numerous RNA binding proteins 
possess the 
rrm domain, including heterogeneous nuclear ribonucleoproteins (hnRNP) proteins, 
which are implicated in the regulation of alternative splicing, and LA proteins, which 
are among the main autoantigens in systemic lupus erythematosus (SLE). 

[0103] Binding protein-related sequences can also possess or interact with 
conserved motifs that mediate their binding to ions, e.g., calcium. Calcium-binding 
proteins such as calmodulin, the calcineurins, and their homologues and related 
proteins are widely used to regulate cellular processes 
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=calcium+binding&search_what=
all&sections=DE&sections=CC&size=100). Ion-binding proteins include 
phosphoproteins that bind to other 
molecules in a manner dependent on their phosphorylation state, and can regulate 
many types of molecules and processes, including those that utilize complex signaling 
cascades (Pang et al., 2001; Pang et al., 2002; Lin et al., 1999). Ion-binding protein- 
related sequences can possess or interact with the EF hand (efhand) domain, a 
calcium-binding domain that comprises a loop of twelve amino acids that coordinates 
a calcium ion in a pentagonal bipyramidal configuration and is flanked on both sides 
by a twelve amino acid alpha-helical domain 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=efhand). 

Breakpoint-Related Sequences 

[0104] A breakpoint is the location on a chromosome where a gene is 
disrupted, and one segment of the gene is severed from the other. Chromosomal 
breaks that disrupt coding or regulatory sequences can result in gene mutation. 
Chromosomal breaks can also serve as molecular landmarks, e.g., a break can be 
detected on Southern blots as the loss of an expected band and the appearance of two 
novel bands. Examples of breakpoint-related sequences include the sequences that 
generate the Philadelphia chromosome translocation, the sequences that generate the 
chromosome translocation t(1;7)(q42;p15), which is implicated in Wilms' tumor, 
and the sequences that generate the chromosomal translocation t(18;21)(q22.1;q21.3), 
which is implicated in Down syndrome. 

[0105] Breakpoints commonly occur in discrete regions of the 
chromosome. Breakage at these regions can lead to a recognized disease phenotype. 
One way of generating such a phenotype is by chromosomal translocation, i.e., 
chromosomes mutate by exchanging parts. When a segment from one chromosome is 
exchanged with a segment from another nonhomologous chromosome, two mutated 
chromosomes are simultaneously generated (Griffiths, et al., 1999). The Philadelphia 
chromosome, a mutation sometimes associated with chronic myelogenous leukemia 
(CML), is an example. It results from the translocation of a discrete segment of 
chromosome 22 into a discrete region of chromosome 9. Patients with the 
Philadelphia chromosome mutation generally have a better prognosis than CML 
patients with other characteristics. 
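
The exchange of segments described above can be pictured with a minimal Python sketch. It assumes, purely for illustration, that each chromosome is a string of labeled positions and that each breaks at a single known index; the function name, the toy strings, and the break indices are hypothetical and do not represent actual chromosome 9 or 22 sequence.

```python
def reciprocal_translocation(chrom_a: str, chrom_b: str,
                             break_a: int, break_b: int) -> tuple[str, str]:
    """Swap the segments distal to one breakpoint on each chromosome.

    Illustrative model only: chromosomes are plain strings, and each
    breakpoint is the index at which the string is cut.
    """
    # Derivative A keeps its own proximal part and gains B's distal part.
    derivative_a = chrom_a[:break_a] + chrom_b[break_b:]
    # Derivative B keeps its own proximal part and gains A's distal part.
    derivative_b = chrom_b[:break_b] + chrom_a[break_a:]
    return derivative_a, derivative_b


# Toy chromosomes: upper case marks the proximal part, lower case the distal part.
der_a, der_b = reciprocal_translocation("AAAAaaaa", "BBbb", 4, 2)
print(der_a, der_b)
```

Both derivative chromosomes are produced by the same exchange, and no material is gained or lost overall, which is why the sketch returns the two products together.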

[0106] Acquired clonal chromosomal abnormalities are found in the 
malignant cells of most patients with leukemia, lymphoma, and solid tumors. Some 
of these abnormalities are the result of consistent chromosomal rearrangements. For 
example, in a preponderant number of chronic myelogenous leukemia cases, 
breakpoints at chromosome band 22q11 occur within a breakpoint cluster region of 
5-6 kb (Weinstein et al., 1988). 

[0107] Chromosome rearrangements affecting band 3q21 are associated 
with a particularly poor prognosis in myeloid leukemia or myelodysplasia. These 
breakpoints cluster in a breakpoint cluster region of approximately 30 kb, located 
centromeric and downstream of the ribophorin I (RPN-I) gene (Weiser, 2002). The 
apoptotic gene bcl-2 was isolated as a breakpoint rearrangement in human follicular 
lymphomas and was shown to act as an oncogene that promoted cell survival rather 
than cell proliferation. 

[0108] Some proteins can act as leukemia or lymphoma-specific 
antigens for major histocompatibility complex-restricted T cell cytotoxicity. These 
include the breakpoint cluster region (bcr)-abl, and other fusion oncoproteins. 
Genetically engineered chimeric and humanized antibodies have demonstrated 
activity against overt lymphomas and leukemias. Radioimmunotherapy has produced 
significant therapeutic responses with minimal radiation exposure to normal tissues 
(Jurcic et al., 2000). 

[0109] Breakpoint-related sequences can possess or interact with 
RhoGAP domains, which are also known as breakpoint cluster region-homology domains 
and mediate signal transduction by small G proteins 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=RhoGAP). Breakpoint-related sequences 
can also possess or 
interact with RhoGEF domains, which comprise approximately 200 amino acid 
residues that encode a guanine nucleotide exchange factor (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=RhoGEF). Breakpoint-related sequences can also possess or 
interact with Plectin/S10 (S10_plectin) domains, which are found at the N-terminus of 
some isoforms of plectin and ribosomal S10 protein 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=S10_plectin). 

Carrier or Transport-Related Sequences 

[0110] A membrane transport protein is an integral transmembrane protein 
that aids the movement of one or more molecules across a cell membrane. Most, if not 
all, types of molecules are transported across membranes, including proteins, ions, 
and fatty acids (Schaffer and Lodish, 1994). Even the movement of molecules such as 
water and urea, which can diffuse across pure phospholipid bilayers, is frequently 
accelerated by transport proteins. Transporters clear cells of toxins, and confer drug 
resistance on tumor lines 
(Ramalho-Santos et al., 2002). The rate of transport varies considerably among 
membrane transport proteins. Membrane transport proteins function in the plasma 
membrane and in intracellular organellar membranes, including the nuclear, 
mitochondrial, lysosomal, and vesicular membranes. For example, transportin, also 
known as karyopherin beta2, imports nuclear mRNA binding proteins from the 
cytoplasm across the nuclear membrane, into the nucleus (Bonifaci et al., 1997). 

[0111] Membrane transport proteins can have either a broad or a narrow 
range of specificity for the transported substance. In mammalian cells, nucleoside 
transport across membranes is mediated by broad specificity transporters. Nucleoside 
transport plays a role in such diverse cellular functions as nucleotide synthesis, 
neurotransmission, and platelet aggregation. Nucleoside transporters carry 
chemotherapeutic nucleosides, and are a target of interest in chemotherapeutic and 
cardiac drug design (Griffiths et al., 1997; Ku et al., 1990). 

[0112] Carriers are another class of membrane transport proteins; they bind 
to a solute and transport it across the membrane by undergoing a series of 
conformational changes. In contrast to channel proteins, transporters bind only one, 
or a few, substrate molecules at a time; after binding substrate molecules, they 
undergo a conformational change such that the bound substrate molecules, and only 
those molecules, are transported across the membrane. Carriers transport a wide 
variety of molecules, including fatty acids across the plasma membrane (Schaffer and 
Lodish, 1994); purines, pyrimidines, and components of nucleosides across the 
nuclear membrane; and adenine nucleotides across the inner mitochondrial membrane 
(Battini et al., 1997). 

[0113] Membrane transport-related sequences can possess or interact with 
vacuolar (H+)-ATPase C, D, G, and H subunit (V-ATPase) domains, which are 
membrane-attached sequences that generate an acidic environment 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=V-ATPase_C). Membrane transport- 
related sequences can also possess or interact with nucleoside transporter 
(Nucleoside_tran) domains, which are found in proteins that transport nucleosides 
across the plasma membrane, and are employed to synthesize nucleotides via the 
salvage pathways in cells that lack their own de novo synthesis pathways 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Nucleoside_tran). Membrane 
transport-related sequences can also possess or interact with ATP synthase F/14-kDa 
subunit (ATP-synt_F) domains, which correspond to a 14-kDa subunit in the peripheral 
catalytic part of vacuolar ATPases 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ATP-synt_F). Membrane transport-related 
sequences can also possess or 
interact with mitochondrial carrier protein (mito_carr) domains, which are involved in 
energy transfer across the inner mitochondrial membrane (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=mito_carr). Membrane transport-related sequences can also 
possess or interact with an AMP-binding enzyme (AMP-binding) domain, which is a 
domain rich in serine, threonine, and glycine, and is characterized by a conserved 
proline-lysine-glycine triplet sequence 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AMP-binding). 

[0114] Membrane transport proteins, such as those expressed in cancer cells, 
are useful as targets for therapeutic intervention, for example, in the screening for 
small molecule inhibitors. Inhibition of membrane transport, as indicated above, may 
make cancer cells more susceptible to chemotherapy, for example. 

Channel-Related Sequences 

[0115] Channel proteins transport water or specific types of ions down 
their concentration or electrical potential gradients. They form a protein-lined 
passageway across the membrane through which multiple water molecules or ions 
move at a very rapid rate, e.g., up to 10^8 per second. The plasma membrane, for 
example, contains potassium-specific channel proteins that generate the cell's resting 
electric potential across the plasma membrane. Examples of channel-related 
sequences include the sodium hydrogen exchanger, sodium potassium ATPase, and 
the cystic fibrosis transmembrane regulator. 

[0116] Members of this subset of membrane transport proteins have 
wide-ranging functions in both normal physiology and in pathology. For example, the 
transport system that mediates the transmembrane exchange of sodium for hydrogen 
across the plasma membrane plays a physiological role in the regulation of 
intracellular pH, the control of cell growth and proliferation, stimulus-response 
coupling, metabolic responses to hormones, the regulation of cell volume, and the 
transepithelial absorption and secretion of several ions. The sodium-hydrogen 
exchanger also plays a role in cancer and in tissue and organ hypertrophy 
(Mahnensmith and Aronson, 1985). 

[0117] Channel-related sequences can possess or interact with 
sodium/hydrogen exchanger (Na_H_Exchanger) domains, which exchange sodium 
for hydrogen across a membrane in an electroneutral manner 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Na_H_Exchanger). Channel-related 
sequences can also 
possess or interact with neurotransmitter-gated ion-channel ligand binding 
(Neur_chan_LBD) domains, which form the extracellular domains of some ion 
channels (http://pfam.wustl.edu/cgi-bin/getdesc?name=Neur_chan_LBD). Channel- 
related sequences can also possess or interact with UBX domains, which are present 
in ubiquitin-regulatory proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=UBX). 

Checkpoint-Related Sequences 

[0118] The cell division cycle is the fundamental means by which living 
things are propagated. Fundamental to successful propagation is the faithful 
replication of DNA; a cell cycle control system exists to coordinate the cycle as a 
whole. The control system is regulated by brakes that can stop the cycle at specific 
checkpoints. Thus, the checkpoints arrest the cycle upon the occurrence of 
undesirable events, such as DNA damage, replication stress, or mitotic spindle 
disruption. For example, DNA lesions and disrupted replication forks are recognized 
by the DNA damage checkpoint and replication checkpoint, respectively. 
Checkpoints can also, for example, initiate protein kinase-based signal transduction 
cascades to activate downstream effectors that elicit cell cycle arrest, DNA repair, or 
apoptosis. These actions prevent the conversion of aberrant DNA structures into 
inheritable mutations and minimize the survival of cells with unrepairable damage 
(Qin and Li, 2003). 

[0119] Dysregulation of the cell cycle is a hallmark of tumor cells. 
Defective checkpoint function results in genetic modifications that contribute to 
tumorigenesis. Checkpoint function can be abrogated by many different mechanisms 
(Bast et al., 2000). For example, cyclin-dependent kinases that normally are 
activated at a checkpoint can be inactivated or activated in an abnormal manner. 
Alternatively, the normal activities of the cyclin-dependent kinase inhibitors, 
phosphatases, or other regulatory molecules of the cell cycle can be altered. Tumor 
suppressors are among the classes of molecules that can effect cell cycle 
dysregulation. The abrogation of checkpoint function can alter the sensitivity of 
tumor cells to chemotherapeutics (Stewart et al., 2003). 

[0120] Checkpoint-related sequences can possess or interact with 
phosphoribosylaminoimidazole-succinocarboxamide synthase (SAICAR_synt) 
domains, which function in de novo purine synthesis 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SAICAR_synt). Checkpoint-related 
sequences can also possess or interact with WD40 domains, which comprise 
approximately 40 amino acids and are sometimes present in tandem repeats 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=WD40). Checkpoint-related sequences can 
also possess or interact with cyclin, C-terminal (cyclin_C) domains, which regulate 
cyclin dependent kinases (http://pfam.wustl.edu/cgi-bin/getdesc?name=cyclin_C). 

[0121] Thus, checkpoint related proteins, e.g., kinases, phosphatases, 
etc., are useful as targets for therapeutic intervention, such as in screening for small 
molecule drugs for the treatment of cancer, immune disorders, and inflammation. 

Complex-Related Sequences 

[0122] Complexes are molecular entities comprised of two or more 
components. Molecular complexes within cells form functional units that carry out 
cellular operations. For example, complexes at the cell membrane perform structural 
and regulatory tasks, including regulating membrane traffic and maintaining organelle 
integrity. Complexes at the cytoskeleton perform static and dynamic roles with 
respect to cell shape, intracellular transport, and communication with the extracellular 
matrix. Complexes in the nucleus transcribe and regulate genes, and complexes at 
sites of protein synthesis translate and regulate proteins. Complexes can reside 
intracellularly and/or extracellularly, e.g., in the extracellular matrix. Examples of 
complex-related sequences include cytoskeletal and filamentous proteins, ADP- 
ribosylation factor (ARF) proteins, and protein synthesis initiation factors (Amor et 
al., 1994). 

[0123] Complex-related sequences can possess or interact with ADP- 
ribosylation factor family (arf) domains, which are GTP-binding domains involved in 
protein trafficking (http://pfam.wustl.edu/cgi-bin/getdesc?name=arf). Complex- 
related sequences can also possess or interact with eukaryotic initiation factor 
domains, e.g., the eukaryotic initiation factor 4E (IF4E) domain, which recognizes 
and binds mRNA during protein synthesis (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=IF4E). Complex-related sequences can also possess or interact with 
intermediate filament (filament) protein domains, which form filamentous structures 
typically 8 to 14 nm wide, and form components of the cytoskeleton and nuclear 
envelope, e.g., neurofilaments, cytokeratins, lamins, vimentin, and desmin 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=filament). 

Cytokine-Related Sequences 

[0124] A cytokine is an extracellular signaling protein or peptide that acts as 
a local mediator in communication among cells. Cytokines regulate proliferation and 
differentiation; for example, they mediate differentiation of cells in the hematopoietic 
lineage. Examples of cytokines include interleukins, interferons, and colony 
stimulating factors of the hematopoietic system. Some cytokines, e.g., interferons and 
interleukins, can be induced by viral activity, and possess antiviral activity (Sheppard 
et al., 2003). Cytokine-related sequences may enable the expression of a cytokine, for 
example, as a cytokine transcription factor (Kao et al., 1994). They can also be part 
of a cytokine effector pathway, for example, as an intracellular effector of cytokine- 
related cytoskeletal changes in response to events in the extracellular matrix (Hirsh et 
al., 2001; Joberty et al., 1999). 

[0125] Cytokine-related sequences can possess or interact with interferon- 
induced transmembrane protein (CD225) domains, which are associated with 
interferon-induced cell growth suppression 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CD225). Cytokine-related sequences can 
also possess or interact with SelR (SelR) domains, which bind both selenium and zinc, 
and/or methionine sulfoxide reductase enzymatic domains 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SelR). Cytokine-related sequences can 
also possess or interact with reverse transcriptase (rvt) domains, which are involved 
in RNA-directed DNA polymerase activity, an enzymatic activity that uses an RNA 
template to produce DNA for integration into a host genome 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rvt). Cytokine-related sequences can also 
possess or interact with L1 transposable element domains (Transposase_22), which are 
described above. 

[0126] Cytokines, thus, are useful as therapeutic proteins for the treatment of 
disorders such as cancer, immune disorders, and inflammation. 

Dehydrogenase-Related Sequences 

[0127] Dehydrogenases are enzymes that catalyze the removal of 
hydrogen atoms in the absence of oxygen. They contribute to a wide range of 
enzymatic reactions, including those involved in amino acid degradation, amino acid 
synthesis, the citric acid cycle, fatty acid oxidation, fatty acid synthesis, glycolysis, 
the pentose phosphate pathway, photosynthesis, pyruvate oxidation, and oxidative 
phosphorylation (Walker et al., 1992). Examples of dehydrogenases include steroid 
dehydrogenases, NADH dehydrogenases, and glyceraldehyde-3-phosphate 
dehydrogenase. 

[0128] Dehydrogenase-related sequences can possess or interact with 
glyceraldehyde 3-phosphate dehydrogenase, NAD binding (GPDH) domains, which 
play a role in glycolysis and gluconeogenesis by reversibly catalyzing the oxidation 
and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=gpdh). Dehydrogenase-related 
sequences can also possess or interact with 3-hydroxyacyl-CoA dehydrogenase, NAD 
binding (3HCDH_N) domains, which catalyze the oxidation of 3-hydroxyacyl-CoA to 
3-oxoacyl-CoA in fatty acid metabolism (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=3HCDH_N). 

Disease-Related Sequences 

Amyotrophic Lateral Sclerosis 

[0129] Amyotrophic Lateral Sclerosis (Lou Gehrig's Disease) is a 
neurodegenerative disease that affects the motor neurons. The disease displays 
multiple clinical variants and can affect motor neurons throughout the nervous 
system, e.g., the spinal cord and brainstem. One clinical variant, the autosomal 
recessive form of juvenile amyotrophic lateral sclerosis, has been mapped to the 
human chromosome 2q33-q34 region (Hadano et al., 2001). A protein family 
characterized by the HAP1 N-terminal conserved region (HAP1_N) domain possesses 
an N-terminal conserved region from hypothetical protein products of ALS2CR3 genes 
found in the 2q33-2q34 region of chromosome 2 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HAP1_N). 

Gaucher's Disease 

[0130] Gaucher's Disease is a genetic disease characterized by a deficiency 
of enzymes responsible for the breakdown and recycling of glycolipids, i.e., lipids 
with carbohydrate moieties, e.g., glucosylceramide; and sphingolipids, i.e., lipids with 
sphingosine moieties, e.g., sphingomyelin. Normally, the glycolipids and 
sphingolipids in the membranes of senescent cells are metabolized by a multi-step 
process that includes the activities of acid beta-glucosidases and saposins. When 
these activities are absent, or present in reduced amounts, glucosylceramide and 
sphingolipids accumulate, and produce the Gaucher's disease phenotype. The disease 
displays multiple clinical variants, and can manifest with central nervous system 
pathology, enlargement of organs, e.g., liver and spleen, and an increase in the level 
of the cytokine transforming growth factor beta (Zhao and Grabowski, 2002; Perez 
Calvo et al., 2000; Cormand et al., 1997). The variability in clinical presentation is 
consistent with the large number of different mutations observed in the acid beta- 
glucosidase and saposin genes. 

[0131] Acid beta-glucosidases are enzymes that metabolize glycolipids. 
Saposins are small proteins that are described in more detail below. Mammalian 
saposins are synthesized as a single precursor molecule (prosaposin) with saposin-A 
(SAPA) and saposin-B (SapB_1; SapB_2) domains; prosaposin becomes an active 
saposin following a proteolytic activation reaction 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SAPA; 
http://pfam.wustl.edu/cgi-bin/getdesc?name=SapB_1; 
http://pfam.wustl.edu/cgi-bin/getdesc?name=SapB_1). 

Huntington Disease 

[0132] Huntington Disease is a progressive neurodegenerative genetic 
disorder characterized by dementia, psychiatric symptoms, and a choreiform 
movement disorder. It is caused by an increased number of repeats of the codon 
CAG, which encodes the amino acid glutamine, in a gene located at the 4p16.3 region 
of chromosome 4, which codes for a protein called huntingtin. The polyglutamine 
tracts expressed by the mutant form of the gene selectively ablate striatal and cortical 
neurons (Ho et al., 2001). 
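
As an illustration of the repeat expansion described above, the following minimal Python sketch counts the longest uninterrupted run of CAG codons in a DNA string and reports the length of the glutamine tract that run would encode. The function name and the toy sequences are hypothetical; they are not the huntingtin coding sequence, and the sketch applies no clinical repeat-number threshold.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the number of CAG codons in the longest uninterrupted CAG run."""
    # Each match is a maximal block of back-to-back CAG triplets.
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def polyglutamine_tract(dna: str) -> str:
    """Return the glutamine (Q) tract encoded by the longest CAG run."""
    # CAG encodes glutamine, so n repeats give a tract of n Q residues.
    return "Q" * longest_cag_run(dna)


# Toy sequences: a short repeat and an expanded repeat (illustrative only).
short_allele = "ATG" + "CAG" * 5 + "CCG"
expanded_allele = "ATG" + "CAG" * 12 + "CCG"
print(longest_cag_run(short_allele), longest_cag_run(expanded_allele))
print(polyglutamine_tract(short_allele))
```

The sketch counts any maximal CAG block regardless of reading frame, which is adequate for a toy input but would need frame information for a real coding sequence.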

[0133] The Huntington Disease gene is widely expressed, but exerts tissue- 
specific effects on neurons (Lin et al., 1993). The gene expresses multiple distinct 
transcripts, and differential polyadenylation of the gene leads to the expression of 
transcripts of different sizes (Lin et al., 1993). There is a relative increase in the 
abundance of one transcript in the human brain, which has been hypothesized to 
account for the tissue-specific effects of the disease (Lin et al., 1993). The HAP1_N 
protein domain, described above, binds to the gene product, huntingtin, in a 
polyglutamine repeat-length-dependent manner 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HAP1_N). This domain is also found in 
several huntingtin-associated protein 1 (HAP1) homologues. 

Multiple Sclerosis (MS) 

[0134] Multiple sclerosis (MS) is a disease characterized by demyelination, 
i.e., the loss of the myelin coating, of nerve axons. Its clinical course varies among 
patients; these variations fall into two broad categories, a relapsing/remitting course, 
and a chronic progressive course. MS has a complex etiology; it has an autoimmune 
component, is influenced by genetics, and sometimes involves infectious agents. MS 
results from an abnormal immune response to one or more antigens present in the 
myelin sheaths that cover the nerve axons of genetically susceptible individuals, 
which may be preceded by exposure to a causal infectious agent (Oksenberg et al., 
1999). 

[0135] The genetic susceptibility to MS is determined by MS susceptibility 
genes, most of which demonstrate only a small to moderate effect on susceptibility, 
e.g., the major histocompatibility complex at chromosome 6p21 (Oksenberg et al., 
1999). An etiological infectious agent has been isolated from the plasma and 
cerebrospinal fluid of patients with multiple sclerosis (Perron et al., 1997). This agent 
is a retroviral oncovirus, known as multiple sclerosis-associated retrovirus (MSRV), 
also called LM7, and is found in association with virions produced by the cultured 
cells of MS patients (Perron et al., 1997). MSRV proteins possess protein domains 
characteristic of retroviral proteins. These include the Gag P30 core shell protein 
(Gag_p30) domain, which is involved in viral assembly
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Gag_p30), and the reverse transcriptase
(rvt) domain, which was described above.

Obesity

[0136] Although single-gene mutations have been shown to cause obesity in 
animal models, the most common forms of human obesity arise from the interactions 
of multiple genes, environmental factors, and behavior. Several genes have been 
shown to affect body weight regulation in humans and other animals. These include 
the ob, lep, CPE, ASIP, LEP, TUB, UPC, POMC, CCKAR, TNFA, and PPAR-γ
genes (Comuzzie et al., 1998). Genetic regulation of body weight can be effected
through diverse mechanisms. For example, the TUB gene family regulates body
weight by encoding proteins that are phosphorylated in response to insulin, mediate
insulin signaling, and are linked to a maturity-onset obesity associated with
insulin resistance (Ikeda et al., 2002). CCKAR genes regulate body weight in a
different manner; they regulate the hormone cholecystokinin, which produces a
feeling of satiety following food intake (Fitter et al., 1994).

[0137] Some genes that regulate body weight possess the WH1 domain, 
which is described above. Genes that regulate body weight can also possess or 
interact with the sprouty (sprouty) domain. This domain is found in sprouty proteins,
which inhibit the Ras/mitogen-activated protein kinase cascade, a pathway initiated
by receptor tyrosine kinases and involved in development
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Sprouty). Genes that regulate body weight
can also possess or interact with a Tub (Tub) domain, which is found in Tubby, a mouse
gene in which an autosomal recessive mutation resulting from a splicing defect causes
maturity-onset obesity, insulin resistance, and sensory deficits
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tub).

Oncogene 

[0138] An oncogene is any one of a large number of genes that can help 
make a cell cancerous. Typically, an oncogene is a mutant form of a normal gene, 
and is often a gene involved in the control of cell growth, division, or differentiation. 
Cells in higher organisms normally grow, divide, differentiate, and die under the 
regulation of other cells. Cancer cells proliferate, in part, because they are able to 
divide without input from other cells, as the result of accumulated mutations. 
Oncogenes include, but are not limited to, genes encoding GTP binding proteins, e.g., 
ras; growth factors, e.g., platelet-derived growth factor; growth factor receptors, e.g.,
platelet-derived growth factor receptor; kinases, e.g., src; nuclear proteins, e.g., myc;
and tumor suppressors, e.g., retinoblastoma proteins.

[0139] The products of oncogenes are frequently proteins involved in cell
signaling, e.g., kinases, GTP-binding proteins, and receptors. For example, many
human cancers have a mutation in a ras gene (Alberts et al., 1994). The ras proteins
belong to a large superfamily of monomeric GTPases, and relay signals from receptor 
tyrosine kinases to the nucleus, stimulating cell proliferation or differentiation. Ras 
proteins function as switches, cycling between an active state in which GTP is bound, 
and an inactive state, in which GDP is bound. A ras gene mutation can result in the 
translation of a protein that fails to hydrolyze its bound GTP, and persists abnormally 
in its active state, transmitting an intracellular signal for cell proliferation or 
differentiation even in the presence of regulatory non-proliferation and
non-differentiation signals. Oncogene-related proteins can possess one of many ras
protein domains (http://pfam.wustl.edu/cgi-b^
what=all&sections=DE &sections=CC&size=100), including the sub-families Ras,
Rab, Rac, Ral, Ran, Rap, and Ypt1. Oncogene-related proteins can also possess a
Gtr1/RagA G-protein conserved region (gtr1_RagA) domain, which is found in some
G-proteins of the Ras family, e.g., the RagA/B human homologues of the ras GTP
binding protein Gtr1 (http://pfam.wustl.edu/cgi-bin/getdesc?name=Gtr1_RagA).
Oncogene-related sequences can also possess or interact with an ATPase domain 
associated with diverse cellular activities; proteins with the AAA (ATPases 
Associated with diverse cellular Activities) domain can perform chaperone-like 
functions that assist in assembling, operating, or disassembling protein complexes. 
The domain includes a conserved region of approximately 220 amino acids that 
contains an ATP-binding site which can act as an ATP-dependent protein clamp to 
hold a protein in place (http://pfam.wustl.edu/cgi-bin/getdesc?name=AAA). Some 
oncogene-related sequences can also possess or interact with a C2 domain of
approximately 116 amino-acid residues, which can be involved in calcium-dependent
phospholipid binding and inositol-1,3,4,5-tetraphosphate binding, and is found, e.g.,
in some isozymes of protein kinase C (http://pfam.wustl.edu/cgi-bin/getdesc?name=C2).
C2 domains are typically located between C1 domains
(which bind phorbol esters and diacylglycerol) and protein kinase catalytic domains.
Regions with homology to the C2 domain are present in many proteins, e.g.,
synaptotagmin.

Parkinson's Disease

[0140] Parkinson's disease is a neurological disorder that affects movement
control. Groups of nerve cells in the central nervous system interact in complex
ways to control movement. One such group of neurons is located in the
substantia nigra of the midbrain; these neurons release the neurotransmitter dopamine,
which allows an organism to fine-tune its movements. In Parkinson's disease, neurons
of the substantia nigra progressively degenerate, leaving the patient with clinical
symptoms that may include resting tremor, muscular rigidity, a slowness of
spontaneous movement, and poor balance and motor coordination (Seigel et al.,
1999).

[0141] Parkinson's disease has multiple causes, including both genes and the 
environment. It also has multiple presentations, including juvenile-onset (before age 
45) and adult onset (after age 45), and can be transmitted through either autosomal 
dominant or autosomal recessive mechanisms. In keeping with the diversity of
etiologies, presentations, and genetic mechanisms, a large and diverse set of genes
and gene products is involved in the pathogenesis of Parkinson's disease. For
example, the PARK2 gene, which encodes the protein parkin, is mutant in autosomal
recessive juvenile parkinsonism. Parkin is a ubiquitin protein ligase that is a
component in the pathway that attaches ubiquitin to specific proteins, designating
them for degradation (Fishman and Oyler, 2002).

[0142] Parkinson's disease-related sequences can possess or interact with 
synuclein domains, which are expressed on the cytoplasmic regions of proteins found 
predominantly in neurons (http://pfam.wustl.edu/cgi-bin/getdesc?name=Synuclein). 
Alpha-synuclein, which possesses a synuclein domain, is mutated in several families 
with autosomal dominant Parkinson's disease. Gamma-synuclein, which also 
possesses a synuclein domain, is overexpressed in breast and ovarian cancers 
(Lavedan, 1998). 

Retinitis Pigmentosa 

[0143] Retinitis pigmentosa is a group of inherited retinopathies 
characterized by early stage loss of night vision, followed by loss of peripheral vision. 
Defects in any structural or functional proteins associated with the rod photoreceptor 
neurons of the retina, which are the cells that transduce light into a neuronal action 
potential, can lead to the disease (Seigel et al., 1999).

[0144] GTPase regulators have been implicated in the pathology of retinitis
pigmentosa. GTPase regulators are proteins that determine whether a GTP binding 
protein exists in a GTP-bound or GDP-bound state (Zhao et al., 2003); they are 
described in more detail below. GTPase regulators have a broad spectrum of 
intracellular functions, including intracellular vesicular transport. These proteins 
localize to a specific region of rod photoreceptor cells, in a narrow cilium that
connects the cell body, where protein synthesis and basic metabolism take place,
with the rod outer segment, where light is transduced to an action potential of the
optic nerve (Zhao et al., 2003). Proteins necessary for the light transduction process 
are made in the cell body and must be transported to the outer segment via vesicular 
transport mechanisms. Mutant GTPase regulators, which regulate vesicular transport, 
play a role in the pathogenesis of retinitis pigmentosa (Roepman et al., 2000). 
Retinitis pigmentosa-related sequences can possess or interact with a Tctex-1 domain,
which comprises a dynein light chain, and can bind to the cytoplasmic tail of
rhodopsins, which are light-sensing proteins present in retinal rod cells
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tctex-1). Mutations in this domain
that are responsible for retinitis pigmentosa inhibit this binding.

Alzheimer's Disease

[0145] Alzheimer's disease is a neurodegenerative dementing illness. It is a 
genetically complex disease with multiple forms, including familial and sporadic 
forms, and early onset and late-onset forms. Mutations in at least four genes are 
known to cause Alzheimer's disease, and there is evidence for additional Alzheimer's 
loci (McKusick, 2003). One form of Alzheimer's disease is caused by mutations in 
the amyloid precursor gene, another form is associated with the apolipoprotein E4 
allele, a third form is caused by a mutant presenilin-1 gene that encodes a
seven-transmembrane domain protein, and a fourth form is caused by a mutant gene
encoding a similar seven-transmembrane domain protein, presenilin-2 (McKusick,
2003).

[0146] Consistent with its multiple etiologies, multiple clinical presentations, 
and multiple genetic loci, Alzheimer's disease has a complex pathology. One facet of
the pathology of Alzheimer's disease is the formation of amyloid plaques from 
amyloid precursor protein (Clark and Karlawish, 2003). Amyloid precursor protein 
can be processed in vitro by several different proteases such as secretases and 
caspases to yield peptide fragments, suggesting that these proteases may play a role in
the formation of pathogenic amyloid plaques in vivo (Suh and Checler, 2002).
Presenilins have been identified as likely candidates for the proteases that cleave
amyloid precursor protein to pathogenic peptide fragments in vivo (Selkoe, 2001).
Another facet of Alzheimer's disease pathology is an inflammatory component 
mediated by microglial cells, the brain's primary immunoeffector cells (Tan et al., 
1999). Microglial cells are attracted to and activated by amyloid deposits; they 
release inflammatory mediators that promote the aggregation of the deposits into 
plaques, and also directly induce or promote neurodegeneration (Hoozemans et al., 
2002). Therefore, current treatment strategies include anti-inflammatory and 
immunotherapeutic approaches, including vaccines (Weiner and Selkoe, 2002). 

[0147] Alzheimer's disease-related sequences can possess or interact with 
trypsin domains, which demonstrate a wide range of peptide degrading activities, 
including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activities 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=trypsin). Alzheimer's disease-related
sequences can also possess or interact with low-density lipoprotein receptor (ldl_rece)
domains, which are characterized by seven successive cysteine-rich repeats of about
40 amino acids at the N-terminal region, and which are also present in receptors for
low density lipoprotein (LDL), the major cholesterol-carrying lipoprotein of plasma
(http://pfam.wustl.edu/cgi-^^
sections =DE&sections=CC&size=100). Alzheimer's disease-related sequences can
also possess or interact with a PT repeat (pt_a) domain, which includes the
tetrapeptide XPTX, or a similar, conserved sequence.

Williams-Beuren Syndrome

[0148] Williams-Beuren syndrome is a complex genetic developmental 
disorder with multisystemic manifestations and variability in its presentation. In
90-95% of the cases reported, a gene deletion occurs at the 7q11.23 location on the long
arm of chromosome 7; in the remaining cases, a variety of other chromosomal
deletions and translocations have been observed (Wang et al., 1999). The most severe
cases are characterized by cardiac anomalies, including aortic stenosis, mental
retardation, growth deficiency, a characteristic facial appearance, dental
malformation, and infantile hypercalcemia (Lashkari et al., 1999).

[0149] The underlying molecular basis for the syndrome is the absence of
the proteins encoded by the genes of the affected region of the chromosome. A
missing elastin gene, with resulting extracellular matrix anomalies, is a consistent
finding. Other genes that are present in and near the commonly deleted region of
chromosome 7, and thus are likely to contribute to pathogenesis, are (1) a gene
encoding a regulator of chromosome condensation-like G-exchanging factor, which is
a factor that exchanges nucleotides for small GTP-binding proteins, (2) an
N-acetylgalactosaminyltransferase, (3) a DNAJ-like chaperone, (4) NOL1/NOP2/sun
domain-containing proteins, including a novel protein designated WBSCR20, which
is expressed in skeletal muscle, and is similar to a 120 kilodalton
proliferation-associated nucleolar antigen, (5) a methyltransferase designated
WBSCR22, and (6) other proteins with no known homologies (Merla et al., 2002; Doll
and Grzeschik, 2001). Williams-Beuren-related sequences can possess or interact with
a GTF2I-like repeat (GTF2I) domain, which is a DNA binding domain commonly deleted
in Williams-Beuren syndrome (http://pfam.wustl.edu/cgi-bin/getdesc?name=GTF2I).

Rheumatic Diseases

[0150] Rheumatic diseases are inflammatory conditions that can have 
autoimmune, infective, or traumatic origins. They include arthritis, systemic lupus 
erythematosus, scleroderma, and Sjogren's syndrome. Arthritis refers to any 
inflammation of a joint. Systemic lupus erythematosus is an autoimmune disease in 
which patients produce antibodies to their own tissues, resulting in an inflammatory 
process that can damage organs. Scleroderma can present as systemic scleroderma, a 
chronic, progressive disease that is characterized by hardening and stiffening of the 
skin and damage to internal organs, e.g., heart, lungs, kidneys and esophagus. 
Sjogren's syndrome is a progressive immunological disorder characterized by 
inflammation and the subsequent destruction of exocrine glands, e.g., salivary glands, 
sweat glands, and lacrimal (tear) glands. 

[0151] The sera of patients with scleroderma and Sjogren's syndrome contain
antibodies directed against a protein that is a normal component of the Golgi
apparatus (Seelig et al., 1994), an intracellular organelle composed of a stack of 
flattened cisternae with associated transport vesicles. The Golgi apparatus sorts 
proteins and sends them to their correct intracellular destination. This antigenic 
protein is a "golgin," one of a class of molecules characterized by an integral 
membrane domain and a large cytoplasmic region. Golgins organize the Golgi's
structure, and influence protein sorting (Gillingham et al., 2002). Golgins function in
a variety of ways, including cross-bridging Golgi cisternae to one another (Linstedt
and Hauri, 1993) and tethering Golgi transport vesicles to the cisternal membranes
(Shorter et al., 2002). Rheumatic disease-associated sequences can possess or interact
with golgin-97, RanBP2alpha, Imh1p, and p230/golgin (GRIP) domains, which are
found in many large coiled-coil proteins, are sufficient for targeting to the Golgi, and
have a conserved tyrosine residue
(http://pfam.wustl.edu/cgi-bin/getdesc?name=GRIP).

Disintegrin-Related Sequences 

[0152] Disintegrins are proteins that interfere with the function of 
integrins. Disintegrins are generally proteins of about 70 amino acid residues that 
contain multiple disulfide bonds, bind with high affinity to a subset of integrins, and 
interfere with integrin binding to physiological ligands. Examples of disintegrin- 
related sequences include snake venoms and related proteins, cysteine-rich 
metalloproteinases and related non-enzymatic sequences, e.g., those expressed in the 
male reproductive tract, and membrane-anchored metalloproteinases with diverse 
functions, e.g., the shedding of cell-surface proteins such as cytokines and cytokine 
receptors, and the conferring of asthma susceptibility (Van Eerdewegh et al., 2002;
Perry et al., 1995).

[0153] Disintegrin-related sequences can possess or interact with
disintegrin domains, which contain an Arg-Gly-Asp sequence, a sequence commonly 
found in adhesion proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=disintegrin). 
Proteins that comprise both disintegrin and metalloproteinase peptidase domains 
include ADAM proteins. Disintegrin-related sequences can also possess or interact 
with reprolysin family propeptide (Pep_M12B_propep) domains, which are domains 
that include the propeptide sequence of members of the peptidase family M12B, and 
contain a sequence motif similar to a sequence found in matrixin proteins 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Pep_M12B_propep).

Factor-Related Sequences 

[0154] A factor is any molecule that contributes to a bodily process. Factors
can function in specific biochemical reactions and cellular functions. There are many 
categories of factors, and factors are involved in many, if not all, physiological and 
pathological processes. Some exemplary factors are described in the following 
paragraphs; they are not exhaustive of the category. 

[0155] Transcription factors are factors that initiate or regulate transcription 
in eukaryotes. They include gene regulatory proteins, which turn specific sets of 
genes on or off, and general transcription factors, which assemble at the promoter
region to enable and regulate transcription of many genes. They also include
transcription elongation factors, which are proteins required for the addition of amino
acids to growing polypeptide chains on ribosomes (Alberts et al., 1994).
Transcription factors interact with a wide variety of molecules, including DNA
binding proteins, polymerases, regulatory molecules such as kinases, and specific
regions of DNA, e.g., promoters and enhancers (Alberts et al., 1994; Vallejo et al.,
1993).

[0156] Translation factors, including translation initiation factors and release
factors, are involved in initiating and regulating the rate of protein synthesis. They
also interact with many molecules, including ribosomal proteins, mRNA, and
molecules that regulate the incorporation of amino acids into protein, such as kinases
and GTP (Price et al., 1993; Alberts et al., 1994).

[0157] Export factors are involved in the export of molecules, e.g., RNA,
from the nucleus (Stutz et al., 2000). Folding factors are involved in the process of
folding proteins into their functional three dimensional shapes, and are also involved
in receptor function (Gao et al., 1994). Factors such as activators and coactivators
interact with nuclear receptors to modulate cellular processes, e.g., transcription
(Mahajan et al., 2002).

[0158] ADP-ribosylation factors are involved in the addition of an ADP- 
ribose group donated from nicotinamide adenine dinucleotide (NAD) to specific 
amino acid residues in heterotrimeric G-proteins. They are involved in, for example, 
normal cellular processes, such as vesicular transport, and also in the pathologic states 
induced by cholera, pertussis, and botulinum toxins (Alberts et al., 1994; Amor et al.,
1994). Guanine nucleotide exchange factors bind to small G-proteins, such as Ras,
and displace GDP in favor of GTP. They act as effectors or modulators of small
G-proteins (Ehrhardt et al., 2001; Janeway et al., 2001; Shao and Andres, 2000).

[0159] Factor-related sequences can possess or interact with ADP- 
ribosylation factor family (arf) domains, which are GTP-binding domains involved in 
protein trafficking (http://pfam.wustl.edu/cgi-bin/getdesc?name=arf). Factor-related
sequences can also possess or interact with elongation factor Tu GTP binding 
(GTP_EFTU) domains, which are elongation factors that promote the GTP-dependent 
binding of aminoacyl tRNA to ribosomes during protein biosynthesis, and catalyze 
the translocation of the newly synthesised protein chain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=GTP_EFTU). Factor-related sequences can
also possess or interact with 4F5 protein family (4F5) domains, which comprise
ubiquitously expressed short proteins rich in aspartate, glutamate, lysine and arginine
(http://pfam.wustl.edu/cgi-bin/getdesc?name=4F5). Factor-related sequences can also
possess or interact with eukaryotic initiation factors, e.g., eukaryotic initiation factor
4E (IF4E), which recognizes and binds mRNA during an early step of protein
synthesis (http://pfam.wustl.edu/cgi-bin/getdesc?name=IF4E).

Germ Cell Specific Protein-Related Sequences

[0160] Germ cells, also called gametes, are cells that contribute to a
new generation of organisms by giving rise to either an egg or a sperm. They are
haploid cells specialized for sexual fusion. Proteins that are specific to germ cells can 
be found at one or more developmental stages of gametes. 

[0161] Germ cell-related sequences include germ cell genes and their
gene products, their regulators and effectors, genes and gene products affected in
disorders associated with germ cells, and antibodies that specifically recognize or 
modulate germ cell-related sequences. Examples of germ cell-related sequences 
include the germ cell-specific Y-box binding protein and contrin. Germ cell specific 
protein-related sequences possess or interact with the cold-shock DNA-binding (CSD) 
domain, which is described above. 

Growth Factor-Related Sequences 

[0162] A growth factor is an extracellular polypeptide signaling molecule 
that stimulates a cell to grow or proliferate. Many types of growth factors exist, 
including protein hormones and steroid hormones. Some growth factors have a broad 
specificity, and some have a narrow specificity. Examples of growth factors with 
broad specificity include platelet-derived growth factor, epidermal growth factor, 
insulin-like growth factor I, transforming growth factor β, and fibroblast growth
factor, which act on many classes of cells. Examples of growth factors with narrow
specificity include erythropoietin, which induces proliferation of precursors of red
blood cells, interleukin-2, which stimulates proliferation of activated T-lymphocytes, 
interleukin-3, which stimulates proliferation and survival of various types of blood 
cell precursors, and nerve growth factor, which promotes the survival and the 
outgrowth of nerve processes from specific classes of neurons. 

[0163] Most growth factors have other actions in addition to inducing cell
growth or proliferation, e.g., they may influence survival, differentiation, migration,
or other cellular functions. Growth factors can have complex effects on their targets,
e.g., they may act on some cells to stimulate cell division, and on others to inhibit it. 
They may stimulate growth at one concentration, and inhibit it at another. Growth
factors are also involved in tumorigenesis.

[0164] Growth factor-related sequences include sequences associated with
the process of stimulating cell growth or proliferation by a growth factor. For 
example, they include intracellular effectors of growth, such as components of 
intracellular pathways that respond to growth factors (Kothapalli et al., 1997; Wax et 
al., 1994), sequences that bind directly or indirectly to growth factors (Van den 
Berghe et al., 2000), and sequences affected as a result of growth factor action. 

[0165] Growth factor-related sequences can possess or interact with a
transforming growth factor beta like (TGF-beta) domain, which is a multifunctional
peptide sequence that controls proliferation, differentiation and other functions in
many cell types (http://pfam.wustl.edu/cgi-bin/getdesc?name=TGF-beta). Growth
factor-related sequences can also possess or interact with a fibroblast growth factor
(FGF) domain, which is found in a family of proteins involved in growth and
differentiation (http://pfam.wustl.edu/cgi-bin/getdesc?name=FGF).

GTPase-Related Sequences 

[0166] GTPases are enzymes that catalyze GTP hydrolysis, and 
comprise a large family of proteins with a similar globular GTP binding domain. 
When GTP is bound to a GTPase, it is hydrolyzed to GDP, and the domain undergoes 
a conformational change that inactivates the protein. GTPases are regulated by 
GTPase regulators, proteins that determine whether a GTP binding protein exists in a 
GTP-bound or GDP-bound state (Zhao et al., 2003). GTPase regulators include 
GTPase activating proteins, which bind the GTPase and induce it to hydrolyze its 
bound GTP to GDP; the GTPase remains in an inactive, GDP-bound state until it 
encounters a guanine nucleotide releasing protein, which binds to the GTPase and 
causes the release of the nucleotide. GTPases have a broad spectrum of intracellular 
functions, including intracellular vesicular transport. Examples of GTPase-related 
sequences include ras, GTPase-activating proteins, and guanine nucleotide releasing 
proteins. 
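
The switching behavior described in paragraph [0166] can be pictured as a small state machine. The following Python sketch is illustrative only and is not part of the disclosed methods; the class name `GTPase` and the method names are hypothetical. It also assumes that GTP binds the nucleotide pocket once the releasing protein has emptied it, a step the paragraph implies but does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GTPase:
    """Toy model of a GTPase as a molecular switch (hypothetical names)."""

    # Bound nucleotide: "GTP", "GDP", or None when the pocket is empty.
    nucleotide: Optional[str] = "GDP"

    @property
    def active(self) -> bool:
        # The protein is active only while GTP is bound.
        return self.nucleotide == "GTP"

    def gap_hydrolyze(self) -> None:
        """GTPase activating protein step: bound GTP is hydrolyzed to GDP."""
        if self.nucleotide == "GTP":
            self.nucleotide = "GDP"

    def release_nucleotide(self) -> None:
        """Guanine nucleotide releasing protein step: bound GDP is released."""
        if self.nucleotide == "GDP":
            self.nucleotide = None

    def bind_gtp(self) -> None:
        """Assumed step: GTP occupies the empty pocket, reactivating the switch."""
        if self.nucleotide is None:
            self.nucleotide = "GTP"


if __name__ == "__main__":
    g = GTPase()
    for step in (g.release_nucleotide, g.bind_gtp, g.gap_hydrolyze):
        step()
        print(step.__name__, g.nucleotide, g.active)
```

Each regulator acts only on the state it recognizes, so calling a step out of order leaves the modeled protein unchanged.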

[0167] GTPase-related sequences can possess or interact with GTPase
activator protein for Ras-like GTPase (RasGAP) domains, which are protein domains
of about 250 residues that accelerate the GTPase activity of ras
(http://pfam.wu^^). GTPase-related sequences
can also possess or interact with putative GTPase activating protein for ARF (ArfGap)
domains, which are protein domains with a zinc finger involved in intermolecular
associations (http://pfam.wustl.edu/cgi-bin/getdesc?name=ArfGap). GTPase-related
sequences can also possess or interact with ankyrin repeat domains (ank), which are
tandemly repeated modules of about 33 amino acids found in a variety of functionally
diverse proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=ank). GTPase-related
sequences can also possess or interact with pleckstrin homology (PH) domains, which
are protein domains of about 100 residues involved in intracellular signaling, or as
components of the cytoskeleton (http://pfam.wustl.edu/cgi-bin/getdesc?name=PH).

Heat-Shock Protein-Related Sequences

[0168] Heat-shock proteins, also referred to as stress-response proteins, are 
proteins that are synthesized in response to an elevated temperature or other cell 
stressor, and help the cell withstand environmental insults. A cell stressor can induce 
a battery of genes that encode gene products that protect the cell from the result of the
insult, e.g., proteins that stabilize and repair partially denatured cell proteins. Some
heat-shock proteins, e.g., chaperones, are present at high levels in unstressed cells,
and further induced by stress. Chaperones assist other proteins in attaining their
proper secondary and tertiary structures. For example, members of the
tubulin-specific chaperone A family possess tubulin-specific chaperone A (TBCA) domains
that fold tubulin polypeptides into their functional configuration 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TBCA). 

[0169] Heat and other stressors further induce the synthesis of a family of 
90-kDa heat-shock proteins that are already abundant in unstressed cells (Pepin et al.,
2001; Lees-Miller et al., 1989; Rebbe et al., 1987). Members of this family possess an
hsp90 protein (HSP90) domain that interacts with tubulin, actin, tyrosine kinase
oncogene products of retroviruses, eIF2alpha kinase, and steroid hormone receptors
(Lees-Miller and Anderson, 1989). This domain includes a highly conserved
N-terminal region, separated from a conserved, acidic C-terminal region by a highly
acidic, flexible linker region (http://pfam.wustl.edu/cgi-bin/getdesc?name=HSP90).

[0170] Another family of heat-shock proteins, the hsp70 proteins, has an
average molecular weight of 70 kDa; some members of this family are only expressed
under conditions of stress, while some are present in cells under normal conditions.
Hsp70 proteins reside in different cellular compartments, e.g., the nucleus, cytosol,
mitochondria, and endoplasmic reticulum. Hsp70 proteins, e.g., Hsc73, can be
differentially expressed at different stages of development (Soulier et al., 1996).
Hsp70 proteins, e.g., the chaperone hsp70-like dnaK protein, can associate with
proteins that possess a DnaJ domain, which comprises an N-terminal conserved
domain of about 70 amino acids, a glycine-rich region of about 30 amino acids, a
central domain containing four repeats of a CXXCXGXG motif, and a C-terminal
region of 120 to 170 amino acids
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DnaJ). Proteins with DnaJ domains can be
post-translationally modified by farnesylation (Andres et al., 1997).
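
As an illustration of how the CXXCXGXG motif named in paragraph [0170] might be located in a polypeptide sequence, the following Python sketch scans a one-letter amino acid string with a regular expression. It is a minimal sketch under stated assumptions: X is taken to mean any single residue, the function name `find_cxxcxgxg` is hypothetical, and the example sequence is invented for demonstration rather than taken from any real protein.

```python
import re

# CXXCXGXG with X assumed to be any one-letter residue code.
# The lookahead lets overlapping occurrences be reported as well.
CXXCXGXG = re.compile(r"(?=(C[A-Z]{2}C[A-Z]G[A-Z]G))")


def find_cxxcxgxg(sequence: str) -> list:
    """Return the 0-based start positions of every CXXCXGXG occurrence."""
    return [match.start() for match in CXXCXGXG.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented sequence containing two copies of the motif.
    example = "MKCAACTGSGAAACPHCKGRG"
    print(find_cxxcxgxg(example))
```

A candidate central domain of the kind described above would be expected to yield four such positions; a pattern match alone does not establish that a sequence folds as a DnaJ domain.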

Helicase-Related Sequences

[0171] Helicases are enzymes that use energy from the hydrolysis of
ATP to unwind the DNA helix at the replication fork, allowing the single strands to be
copied. Proteins with DNA helicase activity play roles in DNA replication, repair, 
and recombination. Disorders associated with helicases include Xeroderma 
pigmentosum, Cockayne syndrome, diffuse collagen disease, alpha-thalassemia, 
Bloom syndrome, Werner syndrome, and Rothmund-Thomson syndrome (Miyajima, 
2002). Examples of helicases include RNA helicases, RECQL4, and 
minichromosome maintenance helicase. 

[0172] Helicase-related sequences can possess or interact with helicase
associated (HA) domains, which are protein domains comprising alpha helices that
may bind to nucleic acids (http://pfam.wustl.edu/cgi-bin/getdesc?name=HA).
Helicase-related sequences can also possess or interact with helicase conserved
C-terminal (helicase_C) domains, which are protein domains that are found in a subset
of helicases designated the DEAD/H helicases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=helicase_C).

Hydrolase-Related Sequences 

[0173] Hydrolases are enzymes that catalyze the hydrolysis of a variety
of bonds, such as esters, glycosides, and peptides. Hydrolases split a molecule into 
fragments by adding water; the water's hydrogen atom is incorporated into one 
fragment, and the hydroxyl group is incorporated into another. Hydrolases are 
involved in a wide range of physiological and pathological processes, including 
proteolysis, phosphatase activity, and sugar metabolism. Examples of hydrolases 
include protein hydrolases, lipid hydrolases, nucleic acid hydrolases, and small 
molecule, e.g., coenzyme A, hydrolases (Hawes et al., 1996).

[0174] Hydrolase-related sequences can possess or interact with
alpha/beta hydrolase fold (abhydrolase) domains, which are catalytic domains found
in a wide range of hydrolytic enzymes of different phylogenetic origins and catalytic
functions (http://pfam.wustl.edu/cgi-bin/getdesc?name=abhydrolase).
Hydrolase-related sequences can also possess or interact with dUTPase domains, which
are protein domains that hydrolyze dUTP to dUMP and pyrophosphate.

Immune Cell-Related Sequences 

[0175] An immune cell is a cell involved in, or associated with, the immune 
system. Immune cells include cells in the myeloid and lymphocytic arms of the 
immune response, as well as their precursors. Immune cells also include cells at all
stages in the differentiation pathways that produce cells associated with the immune 
system. These cells can reside, either permanently or temporarily, in the spleen, 
lymph nodes or mucosal-associated lymphoid tissues (MALT). Immune cell-related 
sequences are involved in all functions of the immune response, e.g., antibody 
production and cell-mediated immunity, and can function at any point in time, ranging 
from the embryonic formation of the immune system, through the time of an immune 
challenge, to many decades later, e.g., when a B-cell memory response is invoked 
(Janeway, 2001). 

[0176] Immune-cell related sequences of differentiating immune cells 
include pre-B cells that do not produce immunoglobulin light chain, but express a 
transcript homologous to immunoglobulin lambda light-chain genes, the expression of 
which is limited to pre-B cells and select other cells that have no surface 
immunoglobulin (Hollis et al., 1989). Immune-cell related sequences of activated 
immune cells include a B-cell-restricted transcription factor expressed by activated B 
cells; its expression pattern suggests it has a role in regulating B-cell differentiation
(Massari et al., 1998).

[0177] Examination of the expression of immune-cell related sequences can 
detect and diagnose immunoregulatory abnormalities. For example, genes that 
encode proteins which mediate the combinatorial process that combines a finite
number of component genes into the very broad range of antigen-specific
immunoglobulin and T-cell binding proteins are expressed at higher levels in patients
with systemic lupus erythematosus (SLE) than in healthy subjects (Girschick et al.,
2002). 

[0178] Immune cell-related sequences can possess or interact with a CUB
domain, which is an extracellular domain of approximately 110 amino acids, and is
present in functionally diverse, including developmentally regulated, proteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CUB). Immune cell-related sequences
can also possess or interact with a CD-20 domain, which has four transmembrane 
regions, both extracellular and cytoplasmic extensions, and is found, inter alia, in a 
high affinity IgE receptor (http://pfam.wustl.edu/cgi-bin/getdesc?name=CD20). 
Immune cell-related sequences can also possess or interact with an interferon-induced 
transmembrane protein (CD225) domain, which is found in a family of proteins that 
includes the human leukocyte antigen CD225, an interferon-inducible transmembrane 
protein associated with interferon-induced cell growth suppression
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CD225). Immune cell-related sequences can also possess
or interact with sushi domains, also known as complement control protein (CCP)
modules, or short consensus repeats (SCR). These domains are found in a wide 
variety of complement and adhesion proteins, including proteins responsible for the 
antigenicity of blood group antigens on the external face of the red blood cell 
membrane (http://pfam.wustl.edu/cgi-bin/getdesc?name=sushi). Immune cell-related 
sequences can also possess or interact with SH2 domains and rvt domains; both are 
described above. 

Integrase-Related Sequences 

[0179] Integrases are enzymes that form proviruses by inserting a linear
double-stranded DNA copy of a retroviral genome into host cell DNA. Examples of 
integrases include HIV integrase, PhiC31 integrase, and Sip. 

[0180] Integrase-related sequences can possess or interact with an
integrase zinc binding (Integrase_Zn) domain, which is a zinc binding protein
domain placed near the N-terminus
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Integrase_Zn). Integrase-related sequences can also possess or interact with an
integrase core (rve) domain, which is a protein domain that forms the central catalytic
core of the integrase (http://pfam.wustl.edu/cgi-bin/getdesc?name=rve). This domain
acts as an endonuclease to cleave the nucleotide and catalyzes the transfer of the viral
DNA strand to the integration site of the host DNA. Integrase-related sequences can also
possess or interact with an integrase DNA binding (integrase) domain, which is a
DNA-binding protein domain near the C-terminus
(http://pfam.wustl.edu/cgi-bin/getdesc?name=integrase). Integrase-related sequences can also possess or interact
with reverse transcriptase (rvt) domains, which are described above. Integrase-related
sequences can also possess or interact with an RNase H domain, which is a protein domain
that hydrolyzes the RNA portion of RNA/DNA hybrids
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rnaseH).

Integrin-Related Sequences 

[0181] Integrins are transmembrane proteins that mediate cell to cell as 
well as cell to matrix adhesion, and provide a means of communication between the 
interior of a cell and the extracellular matrix. The extracellular portion of integrins 
binds to components of the extracellular matrix, e.g., collagen, fibronectin and 
laminin. The intracellular portion of integrins interacts with the cell cytoskeleton, 
e.g., actin filaments near the cell surface. Integrins transmit information about the 
extracellular environment across the plasma membrane to the cytoskeleton, where it is 
available to intracellular signaling mechanisms (Alberts et al., 1994). Structurally, 
integrins consist of heterodimers of an alpha and a beta subunit. Each subunit has a 
large N-terminal extracellular domain followed by a transmembrane domain and a 
short C-terminal cytoplasmic region. The pairing of certain alpha subunits with
certain beta subunits determines ligand specificity, localization, and function. The
extracellular binding domains of integrins often bind their ligands with low affinity;
simultaneous, weak binding to multiple matrix molecules provides the cell with a
means to sense its complex, changing extracellular environment without becoming
glued to it. Examples of integrin-related sequences include integrin alpha and beta 
subunits, collagens, and integrin-linked kinase (Zhang et al., 2002). 

[0182] Integrin-related sequences can possess or interact with von
Willebrand factor type A (vwa) domains, which are protein domains that participate
in diverse biological functions, e.g., cell adhesion, migration, homing, pattern
formation, and signal transduction
(http://pfam.wustl.edu/cgi-bin/getdesc?name=vwa). Integrin-related sequences can also possess or interact with FG-GAP
repeat (FG-GAP) domains, which are protein domains present in the vicinity of ligand
binding domains at the N-terminus of integrin alpha subunits
(http://pfam.wustl.edu/cgi-bin/getdesc?name=FG-GAP).

Interacting Protein-Related Sequences 

[0183] An "interacting protein" is a protein that interacts with another 
molecule. Interacting proteins are involved in every aspect of cellular function. 
Interacting proteins have been characterized in all known locations in the cell, and 
include all, or most, types of proteins. Interacting proteins in the nucleus regulate
such diverse functions as apoptosis, transcription, homologous recombination, and
DNA repair. Nuclear fibroblast growth factor-2 interacting factor interacts with
fibroblast growth factor 2 to prevent apoptosis (Van den Berghe et al., 2000). Grap2
cyclin-D interacting protein (GCIP), a nuclear cell-cycle protein, inhibits select
transcriptional events, and reduces the level of phosphorylation of nuclear
retinoblastoma protein (Chang et al., 2000). Pir51, a human homologue of RecA, a
bacterial enzyme that mediates genetic recombination, interacts with the enzyme
Rad51 to regulate homologous recombination and DNA repair in mammalian cells
(Kovalenko et al., 1997). Hepatitis B virus X-associated protein (HBXAP), a protein
demonstrated to play a role in the development of hepatocellular carcinoma, interacts
with the hepatitis B virus regulatory gene product HBx to increase viral transcription 
(Shamay et al., 2002). 

[0184] Interacting protein-related proteins can utilize many protein domain
motifs for interaction. They can possess or interact with domains that mediate 
interaction with DNA, RNA, ions, or other proteins. For example, PDZ domains, 
which are also known as DHR or GLGF domains, target signaling molecules to 
membranes and mediate the assembly of functional membrane domains (Fanning and 
Anderson, 1999). Interacting protein-related proteins can also possess or interact with 
rrm domains, which are described above. 

Isomerase-Related Sequences 

[0185] Isomerases are enzymes that convert molecules into their 
positional isomers, i.e., into molecules with the same chemical formula but a different 
stereochemical arrangement of atoms. Isomerases act on a wide variety of molecules, 
including sugars, amino acids, and nucleic acids. They are involved in a wide range 
of physiological and pathological functions, including those involving metabolic and 
synthetic pathways. 

[0186] Isomerase-related sequences include isomerase genes and gene
products, their substrates, products, activators, inhibitors, effectors, and cofactors,
regulatory molecules that modulate their function, genes and gene products affected in
disorders associated with isomerases, and antibodies that specifically recognize or
modulate isomerase-related sequences. Examples of isomerase-related sequences 
include triosephosphate isomerases, peptidyl-prolyl isomerases, glucose phosphate
isomerases, disulfide isomerases, ketosteroid isomerases, and
ribosyltransferase-isomerases (Brown et al., 1985).

[0187] Isomerase-related sequences can possess or interact with
triosephosphate isomerase (TIM) domains, which are protein domains that catalyze
the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone
phosphate (http://pfam.wustl.edu/cgi-bin/getdesc?name=TIM). Isomerase-related
sequences can also possess or interact with cyclophilin type peptidyl-prolyl cis-trans
isomerase (pro_isomerase) domains, which accelerate protein folding by catalyzing
the cis-trans isomerization of peptide bonds
(http://pfam.wustl.edu/cgi-bin/getdesc?name=pro_isomerase).

Mucin-Related Sequences 

[0188] The term mucin refers to both an albumin-like substance that is
present in mucus, and to transmembrane proteins that can typically be produced in
both soluble and transmembrane forms. Soluble mucins comprise mucus gels that
protect epithelial cells in the airways, digestive tract, and other organs, and are found
in body fluids, such as milk, tears, and saliva. In their transmembrane forms, mucins
provide a steric barrier to protect the apical surface of epithelial cells.
Transmembrane mucins are also involved in pathogenesis; for example, they mediate
viral entry into cells, promulgate the inflammatory response, and are involved in the
regulation of abnormal cell proliferation (Jeffery and Zhu, 2002; Tsuda et al., 1993). 
Examples of mucins include MUC2 mucin, mucin carcinoembryonic antigen, and 
Muc3 membrane bound intestinal mucin. 

[0189] Mucin-related sequences can possess or interact with mucin-like
glycoprotein (tryp_mucin) domains, which are domains that are involved in the
interaction of parasites with host cells
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tryp_mucin). Mucin-related sequences can also possess or
interact with multi-glycosylated core protein (MGC-24) domains, which are protein
domains of sialomucins that are expressed in many normal and cancerous tissues
(http://pfam.wustl.edu/cgi-bin/getdesc?name=MGC-24).

Other Polypeptide-Related Sequences 

[0190] In addition to the sequences described above, the sequences of
the invention include nucleotide and amino acid sequences, some with known
function, and some with unknown function, that fall into a broad array of categories.

[0191] Polypeptide-related sequences of the invention can possess or
interact with groucho/TLE N-terminal Q-rich (TLE_N) domains, which are protein
domains found in co-repressor proteins, and are involved in oligomerization
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TLE_N). Polypeptide-related
sequences of the invention can also possess or interact with uncharacterized protein
family 0160 (UPF0160) domains, which are protein domains found in proteins that
include multiple metal-binding residues, and in some cases act as a phosphodiesterase
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UPF0160). Polypeptide-related
sequences of the invention can also possess or interact with SNF7 domains, which are
protein domains involved in protein sorting and transport from the endosome to the
lysosome or vacuole of eucaryotic cells
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SNF7). Polypeptide-related sequences of the invention can also possess or
interact with NifU-like N-terminal (NifU_N) domains, which are protein domains
involved in nitrogen fixation, and other functions
(http://pfam.wustl.edu/cgi-bin/getdesc?name=NifU_N). Polypeptide-related sequences of the invention can also
possess or interact with tRNA synthetases class II (D, K, and N) (tRNA-synt_2)
domains, which are protein domains that activate the amino acids asparagine,
aspartic acid, and lysine, and transfer them to specific tRNA molecules
(http://pfam.wustl.edu/cgi-bin/getdesc?name=tRNA-synt_2).

[0192] Polypeptide-related sequences of the invention can also possess 
or interact with dynein heavy chain (dynein_heavy) domains, which are protein 
domains that correspond to the C-terminal region of the dynein heavy chain 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Dynein_heavy). Polypeptide-related 
sequences of the invention can also possess or interact with cyclin-dependent kinase 
regulatory subunit (CKS) domains, which are protein domains of approximately 79-150
amino acid residues that are involved in regulating progression through the cell
cycle (http://pfam.wustl.edu/cgi-bin/getdesc?name=CKS).

[0193] Polypeptide-related sequences of the invention can also possess
or interact with nucleoside diphosphate linked to some other moiety X (NUDIX)
domains, which are protein domains that are involved in removing oxidatively
damaged nucleotides (http://pfam.wustl.edu/cgi-bin/getdesc?name=NUDIX).
Polypeptide-related sequences of the invention can also possess or interact with
T-complex protein/cpn60 chaperonin (cpn60_TCP1) domains, which are protein
domains involved in protein folding and oligomerization
(http://pfam.wustl.edu/cgi-bin/getdesc?name=cpn60_TCP1). Polypeptide-related sequences of the invention can
also possess or interact with F-actin capping protein, beta subunit (F_actin_cap_B)
domains, which are protein domains of approximately 280 amino acids that are
involved in capping actin, i.e., blocking the exchange of actin monomers
(http://pfam.wustl.edu/cgi-bin/getdesc?name=F_actin_cap_B).

[0194] Polypeptide-related sequences of the invention can also possess
or interact with G-protein alpha subunit (G-alpha) domains, which are protein
domains that bind guanyl nucleotides, and function as GTPases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=G-alpha). Polypeptide-related sequences of the invention
can also possess or interact with Kruppel-associated box (KRAB) domains, which are
protein domains involved in protein-protein interactions, and present in some zinc
finger proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=KRAB).
Polypeptide-related sequences of the invention can also possess or interact with metallopeptidase
family M24 (Peptidase_M24) domains, which are protein domains that are found in
some metalloproteases, including proline dipeptidase, and methionine aminopeptidase
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Peptidase_M24). Polypeptide-related
sequences of the invention can also possess or interact with thioredoxin (thiored)
domains, which are protein domains involved in oxidation/reduction reactions by
reversibly oxidizing disulfide bonds
(http://pfam.wustl.edu/cgi-bin/getdesc?name=thiored).

[0195] Polypeptide-related sequences of the invention can also possess
or interact with TUDOR domains, which are protein domains involved in the
formation of primordial germ cells, and in normal abdominal segmentation
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TUDOR). Polypeptide-related
sequences of the invention can also possess or interact with SIT4
phosphatase-associated protein (SAPS) domains, which are protein domains that are involved in
cyclin transcription (http://pfam.wustl.edu/cgi-bin/getdesc?name=SAPS).
Polypeptide-related sequences of the invention can also possess or interact with
ankyrin repeat (ank) domains, which are protein domains of approximately 33 amino
acids, and are sometimes found in tandemly repeated modules
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ank). Polypeptide-related sequences of the invention can also
possess or interact with nicotinamide N-methyltransferase/phenylethanolamine
N-methyltransferase/thioether S-methyltransferase (NNMT_PNMT_TEMT) domains,
which are protein domains that are found in proteins that use
S-adenosyl-L-methionine as the methyl donor
(http://pfam.wustl.edu/cgi-bin/getdesc?name=NNMT_PNMT_TEMT). Polypeptide-related sequences of the invention can also
possess or interact with C1q domains, which are protein domains involved in
activating the serum complement system
(http://pfam.wustl.edu/cgi-bin/getdesc?name=C1q). Polypeptide-related sequences of the invention can also possess or
interact with collagen triple helix repeat (Collagen) domains, which are protein
domains that typically form extracellular connective tissue
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Collagen).

[0196] Polypeptide-related sequences of the invention can also possess
or interact with the hyaluronan/mRNA binding family (HABP4_PAI-RBP1) domain,
which is a protein domain that can bind to the glycosaminoglycan hyaluronan, and to
RNA(http://pfam.wustl.edu/cgi-bin/getdesc?name=HABP4_^ 
Polypeptide-related sequences of the invention can also possess or interact with 
eucaryotic aspartyl protease (asp) domains, which are protein domains that cleave 
peptide bonds; proteins with this domain include pepsins, cathepsins, and rennin 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=asp). Polypeptide-related sequences of 
the invention can also possess or interact with trypsin domains, which are protein
domains that function as serine proteases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=trypsin). Polypeptide-related sequences of the invention can also possess or
interact with Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz_BPTI) domains,
which are protein domains that are found in serine protease inhibitors
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Kunitz_BPTI). Polypeptide-related sequences of the
invention can also possess or interact with proliferating cell nuclear antigen,
N-terminal (PCNA) domains, which are protein domains that are found on non-histone
acidic nuclear proteins, and play a role in controlling DNA replication
(http://pfam.wustl.edu/cgi-bin/getdesc?name=PCNA).

Oxygenase-Related Sequences

[0197] Oxygenases are enzymes that catalyze the incorporation of
molecular oxygen into organic substances. Dioxygenases, also known as oxygen
transferases, catalyze the introduction of both atoms of molecular oxygen, and 
typically contain iron. Monooxygenases, also known as mixed function oxygenases, 
introduce one oxygen atom; the other is reduced to water. Examples of oxygenase- 
related sequences include cytochrome oxygenases, heme oxygenases, 
cyclooxygenases, lipoxygenases, and peptide-aspartate beta-dioxygenase. 

[0198] Oxygenase-related sequences can possess or interact with alkyl
hydroperoxide reductase/thiol specific antioxidant (AhpC-TSA) domains, which are
responsible for providing a defense against sulfur-containing radicals; proteins that
possess this domain include allergens, e.g., asp f 3, mal f 2, and mal f 3
(http://pfam.wusu^ Oxygenase-related 
sequences can also possess or interact with monooxygenase domains, which are 
protein domains that utilize flavin adenine dinucleotide (FAD)
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Monooxygenase). Oxygenase-related sequences can also
possess or interact with dioxygenase domains, which are protein domains that
catalyze the incorporation of both atoms of molecular oxygen into substrates
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Dioxygenase).

Peroxidase-Related Sequences 

[0199] Peroxidases are enzymes that catalyze the reduction of
hydrogen peroxide. Peroxidases are generally located within peroxisomes, which are
intracellular organelles that metabolize fatty acids and toxic compounds. Disorders 
associated with peroxidase-related sequences include X-linked adrenoleukodystrophy. 
Examples of peroxidase-related sequences include glutathione peroxidases, thiol
peroxidases, catalases, horseradish peroxidases, anionic peroxidases, and thyroid 
peroxidases. 

[0200] Peroxidase-related sequences can possess or interact with alkyl 
hydroperoxide reductase/thiol specific antioxidant (AhpC-TSA) domains, which are 
protein domains that can reduce organic hydroperoxides
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AhpC-TSA).

Phospholipase-Related Sequences 

[0201] Phospholipases are enzymes that act on phospholipids. They 
characteristically generate products that are active in signal transduction pathways. 
For example, phospholipase C hydrolyzes phosphatidylinositol bisphosphate (PIP2) to
generate the two intracellular mediators, inositol trisphosphate (IP3) and
diacylglycerol. IP3 releases Ca2+ from stores in the endoplasmic reticulum, increasing
the cytosolic Ca2+ concentration. Diacylglycerol remains in the plasma membrane
and activates protein kinase C. 

[0202] Phospholipase activity is involved in the synthesis of eicosanoids, 
inflammatory mediators that include prostaglandins, prostacyclins, thromboxanes, and 
leukotrienes. Corticosteroid hormones, such as cortisone, for example, inhibit 
phospholipase activity in the first step of the eicosanoid synthesis pathway.
Corticosteroid hormones are widely used clinically to treat noninfectious 
inflammatory diseases, such as some forms of arthritis (Ribardo et al., 2002). 

[0203] Phospholipids play a pivotal role in the modulation of intestinal 
inflammation. The mucosal surface of the digestive tract functions as a regulatory 
barrier between the gastrointestinal lumen and the underlying mucosal immune 
system. Phospholipids help preserve the mucosa following various forms of injury or
physiological damage to the lumen, thus limiting the invasion of harmful luminal
factors into the host, which may otherwise lead to inflammation or a pathological
immune response. Phospholipids can thereby both promote and inhibit gastrointestinal
inflammation and immunity (Sturm and Dignass, 2002).

[0204] Phospholipase-related sequences can possess or interact with
lysophospholipase catalytic (PLA2_B) domains, which catalyze the release of fatty
acids from lysophospholipids (http://pfam.wustl.edu/cgi-bin/getdesc?name=PLA2_B).
Phospholipase-related sequences can also possess or interact with
phospholipase/carboxylesterase (abhydrolase_2) domains, which have broad substrate
specificity (http://pfam.wustl.edu/cgi-bin/getdesc?name=abhydrolase_2).
Phospholipase-related sequences can also possess or interact with GDSL-like
lipase/acylhydrolase (Lipase_GDSL) domains, which are present in lipolytic enzymes
with serine in the active site
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Lipase_GDSL).

Prosaposin-Related Sequences 

[0205] Saposins are small lysosomal proteins that activate lysosomal
lipid-degrading enzymes, including enzymes that metabolize sphingosine. They
typically isolate lipids from their membrane surroundings, and increase their 
accessibility to degradative enzymes. Mammalian saposins are synthesized as a 
single precursor molecule, prosaposin, which becomes an active saposin following 
proteolytic activation. Examples of prosaposin-related sequences include saposin A, 
saposin B, and saposin C. Disorders associated with prosaposin-related sequences 
include neurodegenerative diseases similar to Tay-Sachs and Sandhoff
diseases, e.g., Gaucher's disease, which is described above. 

[0206] Prosaposin-related sequences can possess or interact with
saposin A (SapA) domains, saposin B1 (SapB_1) domains, and saposin B2 (SapB_2)
domains, which are described above.

Proteasome-Related Sequences

[0207] Proteasomes are intracellular complexes that degrade proteins.
Proteasomes recognize proteins that have been marked for destruction by the addition
of a ubiquitin molecule, unfold these ubiquitinated proteins, cleave them into small
peptides of 6-12 amino acids, and release them into the cytosol (Mitch and Goldberg, 
1996). Examples of proteasome-related sequences include 26S proteasome subunits, 
26S proteasome regulatory chains, and ubiquitin. 

[0208] Proteasome-related sequences can possess or interact with
proteasome/cyclosome repeat (PCrep) domains, which are protein domains that are
present in regulatory subunits of the proteasome
(http://pfam.wustl.edu/cgi-bin/getdesc?name=PCrep). Proteasome-related sequences can also possess or
interact with Mov34/MPN/PAD-1 family (Mov34) domains, which are protein
domains found at the N-terminus of regulatory subunits of the proteasome
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Mov34).

Reductase-Related Sequences 

[0209] Reductases are enzymes that catalyze reduction reactions, i.e.,
reactions in which hydrogen is combined with a molecule, or reactions in which
oxygen is removed from a molecule. Examples of reductases include dehydrogenase 
reductases, oxidoreductases, quinone reductases, CoA reductases, dihydrofolate 
reductases, tetrahydrofolate reductases, carbonyl reductases, nitrate reductases, 
epoxide reductases, NADP(+) reductases, ribonucleotide reductases, and thioredoxin
reductases (Loeffen et al., 1998).

[0210] Reductase-related sequences can possess or interact with short
chain dehydrogenase (adh_short) domains, which are present in a wide variety of
proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=adh_short). Reductase-related
sequences can also possess or interact with NADH-Ubiquinone oxidoreductase (complex
I), chain 5 N-terminus (oxidored_q1_N) domains, which are protein domains that
catalyze the transfer of electrons from NADH to ubiquinone in a reaction that can be
associated with proton translocation across a membrane
(http://pfam.wustl.edu/cgi-bin/getdesc?name=oxidored_q1_N).

Reverse Transcriptase-Related Sequences 

[0211] Reverse transcriptases are enzymes that make double-stranded
DNA copies from single-stranded nucleic acid template molecules. Typically, a
reverse transcriptase is a DNA polymerase that can copy both RNA and DNA
templates, and has an integral RNase H activity (Lim et al., 2002). The two
enzymatic domains of reverse transcriptase reflect these two activities; the first is a 
DNA polymerase domain that can use either RNA or DNA as a template to synthesize 
either the minus-strand or the plus strand of DNA, and the second is an RNase H 
domain that degrades the RNA in RNA-DNA hybrids (Coffin, 1997; Wu and Gallo, 
1975). 

[0212] Reverse transcriptase plays a role in the replication of some
viruses, e.g., retroviruses. It copies the retroviral RNA genome to produce a single 
minus strand of DNA, then catalyzes the synthesis of a complementary plus strand. 
Accordingly, reverse transcriptase is a therapeutic target for conditions that involve 
retroviruses, e.g., Acquired Immune Deficiency Syndrome (AIDS). A number of
anti-retroviral drugs inhibit reverse transcriptase (Frank, 2002).
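
The two synthesis steps described above, i.e., copying the RNA genome into a minus strand of DNA and then synthesizing the complementary plus strand, can be sketched at the level of base pairing. The following is an illustrative sketch only: the function names `minus_strand` and `plus_strand` and the short example sequence are assumptions, and the sketch ignores priming, strand transfer, and RNase H degradation of the template.

```python
# Watson-Crick pairing rules for the two synthesis steps.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA template -> DNA
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # DNA template -> DNA


def minus_strand(rna):
    """Return the minus-strand DNA complementary to an RNA template (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def plus_strand(minus_dna):
    """Return the plus-strand DNA complementary to the minus strand (5'->3')."""
    return "".join(DNA_TO_DNA[base] for base in reversed(minus_dna))


if __name__ == "__main__":
    genome_rna = "AUGGCU"            # hypothetical fragment, for illustration only
    cdna = minus_strand(genome_rna)  # first step: RNA-templated synthesis
    second = plus_strand(cdna)       # second step: DNA-templated synthesis
    print(cdna, second)
```

In this sketch the plus strand reproduces the sequence of the original RNA with thymine in place of uracil, which is the relationship the paragraph above describes.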

[0213] Reverse transcriptase is also a standard scientific research tool in 
the field of molecular biology. The reverse transcriptase polymerase chain reaction
(RT-PCR) amplifies specific DNA sequences rapidly and in vitro. RT-PCR can detect
trace amounts of RNA and DNA, and is used in a wide range of applications,
including forensics, the diagnosis of genetic diseases, determination of the prognosis
of diagnosed diseases, and the detection of viral infection (Alberts et al., 1994). For
example, reverse transcriptase is used to diagnose cancer (Rowland, 2002), and to
provide prognostic information about the predicted survival of patients with prostate
cancer (Kantoff et al., 2001).

[0214] An example of a reverse transcriptase is telomerase, a general 
tumor marker with a reverse transcriptase catalytic subunit (Kirkpatrick and Mokbel, 
2001). Most human somatic cells do not express the telomerase reverse transcriptase 
gene; conversely, most cancer cells express this gene (Ducrest et al., 2002; Kyo et al., 
2000). The human telomerase reverse transcriptase promoter has been placed in gene
therapy vectors that specifically target telomerase-positive tumor cells, and spare 
nearby telomerase-negative cells (Pan and Koeneman, 1999). Human telomerase 
reverse transcriptase is also recognized as a tumor antigen that can be a target for 
immunotherapeutic approaches to cancer (Gordan and Vonderheide, 2002).

[0215] Reverse transcriptase-related sequences can possess or interact
with rvt, transposase_22, WD40, and Exo_endo_phos domains, all of which are
described above.

Ribosome-Related Sequences

[0216] A ribosome is a particle composed of ribosomal proteins and
ribosomal RNA that catalyzes protein synthesis from messenger RNA. Ribosomes
are composed of two subunits, the large (L) subunit and the small (S) subunit. The
typical mammalian ribosome comprises four RNA molecules and approximately
eighty different proteins, which are highly conserved among prokaryotes and
eukaryotes, and perform a variety of tasks related to protein synthesis, e.g.,
coordinating protein synthesis in a manner that maintains cell homeostasis
(Yoshihama et al., 2002; Kenmochi et al., 1998). 

[0217] Ribosomal proteins can perform functions independent of their
involvement in protein synthesis. For example, they are involved in cell-cycle
progression, e.g., as cell cycle checkpoints, and mediators of homologous 
recombination, embryogenesis, and skeletal development (Yoshihama et al., 2002; 
Chen and Ioannou, 1999). They also contribute to the regulation of cell growth, 
transformation, and death, and can induce apoptosis (Chen and Ioannou, 1999; Naora
et al., 1999). Mutations in ribosomal proteins are associated with human diseases,
including Down syndrome, Diamond-Blackfan anemia, Turner syndrome, and 
Noonan syndrome (Yoshihama et al., 2002). 

[0218] Ribosomal proteins have been grouped into protein families on the
basis of sequence similarities in functional domains. One family of ribosomal
proteins, the ribosomal protein L11, RNA binding (Ribosomal_L11) family, is
comprised of members that possess the L11 RNA binding domain; this family
includes the ribosomal proteins L11 and L12, which are components of the large
subunit. L11 is a protein of 140 to 165 amino acids that binds to a 23S RNA
molecule, the C-terminal region of which is buried within the ribosomal structure
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L11). Another family of
large ribosomal subunit proteins possesses the ribosomal protein L13e
(Ribosomal_L13e) domain, which is found in a wide range of vertebrates and in
lower-order species (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L13e),
as is the ribosomal protein L44 (Ribosomal_L44) domain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L44).

[0219] Additional ribosomal protein families encompass small subunit
proteins. The ribosomal protein S6e (Ribosomal_S6e) domain is present in a family
of proteins which includes protein kinase substrates that control cell growth and
proliferation by selectively translating particular classes of mRNA
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_S6e). The ribosomal
protein S8e (Ribosomal_S8e) domain is present in a family of proteins comprising
approximately 220 amino acids in eukaryotes, and about 125 amino acids in
archaebacteria (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_S8e). The
ribosomal protein S10p/S20e (Ribosomal_S10) domain is present in a family of
proteins which includes the small ribosomal subunit S10 from prokaryotes and S20
from eukaryotes (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_S10). S10
is involved in binding transfer RNA to the ribosome, and also operates as a
transcriptional elongation factor.

RNase-Related Sequences 

[0220] RNases are enzymes that cleave RNA. RNases generally
recognize their targets by tertiary structure, rather than by sequence; they include
exonucleases, which remove the terminal base in an RNA sequence, and
endonucleases, which can cleave non-terminal bases. Examples of RNases include
RNase E, which is involved in the formation of 5S ribosomal RNA from
pre-ribosomal RNA; RNase F, which cleaves both viral and host RNA in response to
interferons, inhibiting protein synthesis; RNase H, which is specific for the RNA
strand of an RNA-DNA hybrid; RNase P, which generates transfer RNA from
precursor transcripts; and RNase T, which removes the terminal AMP from
nonaminoacylated tRNA (Coffin, et al., 1997).

[0221] RNase-related sequences can possess or interact with rvt, rve,
RNase H, and gag_p30 domains, all of which are described above. 

RNase H-Related Sequences 

[0222] RNase H is a nuclease specific for the RNA strand of an RNA-DNA
hybrid that cleaves phosphodiester bonds to produce molecules with 3'-OH and
5'-PO4 ends. Multiple forms of RNase H are present in both prokaryotes and
eukaryotes. RNase H may be part of larger polypeptides and its activity can be 
influenced by other regions of these polypeptides (Coffin, et al., 1997; Crouch 1990). 

[0223] During retroviral replication, RNase H activity forms 
oligonucleotides that prime DNA synthesis. Therefore, the RNase H activity of 
reverse transcriptase is a target for therapeutic intervention. For example, small 
molecule inhibitors of retroviral RNase H function have shown promise in managing 
HIV infection (Klarman, et al., 2002).

[0224] Another therapeutic indication for RNase H is the regulation of
cancer genes by targeting mRNA translation. Antisense deoxyoligonucleotides
down-regulate mRNA expression by annealing to specific regions of an mRNA. Formation
of the DNA:RNA heteroduplex then triggers mRNA cleavage by RNase H. Cleavage
is rapidly followed by further degradation, irreversibly preventing translation of the
target mRNA. Antisense deoxyoligonucleotides that trigger RNase H activity can
thus be used as cancer therapeutic agents (Crooke, 1996; Curcio et al., 1997).
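
The annealing step described above can be illustrated with a minimal Python sketch. It derives the antisense deoxyoligonucleotide for a chosen mRNA region as the DNA reverse complement of that region. The function names and the 12-base target fragment are hypothetical and chosen only for illustration; they are not sequences or methods of this application, and practical oligonucleotide design would also weigh length, chemistry, and target accessibility.

```python
# Illustrative sketch only. The target fragment below is invented for the
# example and is not one of the sequences of this application.

# Watson-Crick partner, written as DNA, for each RNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna(mrna_region: str) -> str:
    """Return the antisense DNA oligo (5'->3') for an mRNA region (5'->3')."""
    region = mrna_region.upper()
    unknown = set(region) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # Reverse the region, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(region))


def forms_heteroduplex(mrna_region: str, oligo: str) -> bool:
    """True if the oligo is fully complementary to the mRNA region."""
    return antisense_dna(mrna_region) == oligo.upper()


target = "AUGGCCAAGUUC"          # hypothetical mRNA fragment
oligo = antisense_dna(target)
print(oligo)                      # GAACTTGGCCAT
print(forms_heteroduplex(target, oligo))
```

A fully complementary oligonucleotide of this kind is the species expected to form the DNA:RNA heteroduplex that RNase H recognizes.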

[0225] RNase H-related sequences can possess or interact with rnaseH,
gag_p30, rvt, and rve domains, all of which are described above.

SH3-Related Sequences 

[0226] Src homology region 3 (SH3) is a polypeptide domain commonly
found in intracellular signaling proteins; it binds with moderate affinity and selectivity
to proline-rich ligands. SH3 domains are heterogeneous; different SH3 domains bind
to different proline-rich sequences (Gmeiner and Horita, 2001). SH3 domains are
involved in a wide variety of biological processes, including mediating the assembly
of large multiprotein complexes, regulating enzyme activity, and modulating the local
concentration or subcellular localization of signaling pathway components (Mayer,
2001). Examples of SH3-related sequences include phosphotyrosine receptors,
membrane associated guanylate kinases, mitogen-activated protein kinases, myosin 1,
the Crk adaptor protein, phospholipase C-γ, Grb2, Sos, src-SH3, Abl-SH3, the Nck
adaptor, and alpha-spectrin-SH3.

[0227] SH3-related sequences can possess or interact with SH3
domains, which are protein domains of approximately 50-70 amino acids, and are
present in a large number of proteins involved in intracellular signaling
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH3). SH3-related sequences can also possess or
interact with SH3 domain-binding protein 5 (SH3BP5) domains, which are protein
domains that act as a substrate for c-Jun N-terminal kinase
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH3BP5).

Stem Cell-Related Sequences 

[0228] Stem cells are pluripotent or multipotent cells that generate maturing 
cells in multiple differentiation lineages. Pluripotent cells have the capacity to 
differentiate into each and every cell present in the organism. Embryonic stem cells 
are pluripotent; they can differentiate into any of the cells present in the adult.
Multipotent cells have the ability to differentiate into more than one cell type.
Organ-specific stem cells are multipotent; they can differentiate into any of the cells of the
organ they inhabit.

[0229] When they divide in vivo, both pluripotent and multipotent stem cells 
can maintain their pluripotency or multipotency while giving rise to differentiated 
progeny. Thus, stem cells can produce replicas of themselves which are pluri- or 
multipotent, and are also able to differentiate into lineage-restricted committed 
progenitor cells. For example, hematopoietic stem cells, which are multipotent cells
specifically able to form blood cells, can divide to produce replicate hematopoietic
stem cells. They can also divide to produce more highly differentiated cells, which
are precursors of blood cells. The precursors differentiate, sometimes through several
generations of cells, into blood cells. A hematopoietic stem cell can also divide to
produce, for example, a relatively undifferentiated cell that is committed to
differentiate into a particular type of blood cell, e.g., a granulocyte or an
erythrocyte.

[0230] Stem cells can also reproduce and differentiate in vitro. Embryonic
stem cells have been directed to differentiate into cardiac muscle cells in vitro and, 
alternatively, into early progenitors of neural stem cells, and then into mature neurons 
and glial cells in vitro (Trounson, 2002). 

[0231] Stem cell therapy is effective in treating cancer in humans (Slavin et
al., 2001), and offers several advantages over traditional cancer therapies (Weissman, 
2000). One advantage of stem cell therapy exists when used in conjunction with 
radiation therapy. In radiation therapy for cancer, the dose of radiation necessary to 
kill the cancer cells in an organ can also be sufficient to destroy the healthy cells of 
the organ. In combined stem cell and radiation therapy, an organ is first treated with 
sufficient radiation to destroy all of the cancer cells and most or all of the healthy 
cells, but then stem cells are infused to repopulate the organ. In the ensuing weeks, as 
the cancer cells and healthy cells die, the stem cells replace the healthy cells. Another 
advantage of this approach, compared to heterologous organ transplants, is that there 
is no risk of rejection, since stem cells do not provoke an immune response. A further 
advantage is that stem cells are inherently programmed to regulate their numbers and 
differentiation status, i.e., once provided to the patient, the necessary number will 
differentiate, and the rest will remain undifferentiated (Weissman, 2000).

[0232] Stem cell therapy is also effective in treating autoimmune disease in
humans. For example, immunosuppression in conjunction with stem-cell 
transplantation has induced remission in patients with refractory, severe rheumatic 
autoimmune disease (Van Laar and Tyndall, 2003). Patients with rheumatoid 
arthritis, systemic lupus erythematosus, systemic sclerosis, and juvenile idiopathic 
arthritis have benefited from stem cell transplants (Van Laar and Tyndall, 2003). 

[0233] Preclinical studies also suggest the potential of stem cell 
transplantation for the treatment of neural and muscular injuries and disorders, 
including those of the central nervous system, peripheral nervous system, and skeletal, 
cardiac and smooth muscle (Deasy and Huard, 2002). Stem cells transplanted into the 
bone marrow of mice migrate to the site of injured muscle and differentiate into new 
muscle cells. For example, patients with myasthenia gravis, muscular dystrophies, 
amyotrophic lateral sclerosis, congestive heart failure, Parkinson's disease, and 
Alzheimer's disease may benefit from stem cell therapy (Henningson, 2003). 

[0234] In addition to therapeutic uses, research using stem cells can provide 
useful information about normal stem cell function and the pathogenesis of disease. 
Stem cells derived from a patient with a genetic disease can provide a tool for 
studying that disease. To derive these stem cells, a somatic cell, i.e., a cell that is not 
in the oocyte or spermatocyte lineage, is donated by the patient, and the nucleus is 
removed and transferred to an unfertilized human oocyte. This nuclear transplant 
procedure produces, at the blastocyst stage of development, embryonic stem cells 
with the same set of genes as the patient with the genetic disease. Studying these 
cells, and their progeny in vitro, permits analysis of a specific model of the disease. 
For example, placing stem cells derived from a patient with a genetic disorder under 
the control of various stem cell regulatory factors can elicit abnormal responses from 
the affected stem cells compared to stem cells derived from a healthy individual's 
somatic nucleus. 

[0235] Embryonic stem cell-related sequences can possess or interact with
the stem cell factor (SCF) domain, a transmembrane domain having a soluble,
secreted form, which is involved in hematopoiesis, and which binds to and activates a
receptor tyrosine kinase, stimulating the proliferation of mast cells and augmenting
the proliferation of myeloid and lymphoid hematopoietic progenitors in bone marrow
culture (http://pfam.wustl.edu/cgi-bin/getdesc?name=SCF).

[0236] Certain stem cell related sequences can possess the ability to maintain
the stem cell in an undifferentiated state while allowing cell proliferation. Such
compositions can be useful in ex vivo cell therapy to expand populations of cells for
cell replacement therapy.

[0237] Certain stem cell related sequences can possess the ability to cause
cell differentiation to a relatively mature cell type and are useful in in vivo or ex vivo
therapy to compensate for a deficiency of such a relatively mature cell type.

Synthetase-Related Sequences 

[0238] A synthetase is an enzyme that catalyzes the synthesis of a
molecule. Synthetases comprise a broad class of enzymes; they catalyze the synthesis
of nucleic acids, peptides, and lipids (Agou et al., 1996). Examples of synthetases
include lysyl-tRNA synthetase, asparaginyl-tRNA synthetase, holocarboxylase
synthetase, carbamyl phosphate synthetase I, and argininosuccinate synthetase.

[0239] Synthetase-related sequences can possess or interact with transfer
RNA synthetase domains, which are protein domains that activate amino acids and
transfer them to specific transfer RNA molecules as a step in protein biosynthesis
(http://pfam.wustl.edu/cgi-bin/getdesc?name=tRNA-synt_2). The 20 aminoacyl-tRNA
synthetases are divided into class I and class II, each of which contains multiple
synthetases with different specificities. For example, there is a protein domain
shared by the asparaginyl-, aspartyl-, and lysyl-tRNA synthetases
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=trna-synt&search_what=all&sections=DE&sections=CC&size=100).
Synthetase-related sequences can also possess or
interact with lipid-A-disaccharide synthetase (LpxB) domains, which are protein
domains that catalyze the synthesis of disaccharides
(http://pfam.wustl.edu/cgi-bin/getdesc?name=LpxB).

TATA Box-Related Sequences 

[0240] A TATA box is a consensus sequence in the promoter region of
many eucaryotic genes that binds a general transcription factor and plays a role in
specifying the position for transcription initiation. TATA boxes are generally found
approximately 25 nucleotides upstream of the site of transcription initiation (Chalut et al.,
1995). Examples of TATA box-related sequences include TATA box binding
protein, 13 TATA/TBP, and small nuclear RNA-activating protein 190 Myb DNA.
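
The positional relationship described above can be sketched in Python as a scan for a TATA-like motif within a window upstream of a known transcription start site. The simplified consensus TATA[AT]A[AT], the window of -40 to -15, the function name, and the toy promoter are all assumptions made for illustration; real promoters vary, and many lack a TATA box altogether.

```python
import re

# Simplified TATA consensus (an assumption for this sketch).
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(promoter, tss_index, window=(-40, -15)):
    """Return (position, motif) pairs for TATA-like motifs near the start site.

    promoter  -- DNA sequence, 5'->3', containing the start site
    tss_index -- 0-based index of the first transcribed base in promoter
    window    -- inclusive range of allowed motif start positions,
                 relative to the start site (negative means upstream)
    """
    hits = []
    for match in TATA_PATTERN.finditer(promoter.upper()):
        relative = match.start() - tss_index
        if window[0] <= relative <= window[1]:
            hits.append((relative, match.group()))
    return hits


# Toy promoter: 20 filler bases, a TATA-like motif, 23 filler bases,
# then the first transcribed bases starting at index 50.
promoter = "G" * 20 + "TATAAAA" + "C" * 23 + "ATGGCC"
print(find_tata_boxes(promoter, tss_index=50))   # [(-30, 'TATAAAA')]
```

Widening or narrowing the window changes how strictly the "approximately 25 nucleotides" relationship is enforced.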

[0241] TATA box-related sequences can possess or interact with
transcription factor TFIID, also known as the TATA-binding protein (TBP) domain,
which is a protein domain that specifically binds to the TATA box promoter element
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TBP). TATA box-related sequences
can also possess or interact with HMG14 and HMG17 (HMG14_17) domains, which
are members of a family of high mobility group proteins, described above
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HMG14_17).

Tat-Related Sequences 

[0242] Tat is a human immunodeficiency virus (HIV) protein involved
in viral production of new RNA genomes and new complete viral particles. Tat is
also involved in AIDS pathogenesis; it plays a role in reactivating latent viruses, e.g.,
the JC virus; it is involved in the development of AIDS-related Kaposi's
sarcoma; and it depresses the function of, and induces apoptosis in, CD4 helper cells
(Yu et al., 1995). Examples of Tat-related sequences include Tat-associated proteins,
e.g., Tap, HIV-1 Rev, and tat-associated kinase (also known as positive transcriptional
elongation factor b).

[0243] Tat-related sequences can possess or interact with
transactivating regulatory protein (Tat) domains, which are protein domains that
contribute to efficient transcription of a viral genome
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tat). Tat-related sequences can also possess or interact with
mitochondrial glycoprotein (MAM33) domains, which are protein domains found in
mitochondrial matrix proteins, and which can be involved in mitochondrial oxidative
phosphorylation and in interactions between the nucleus and the mitochondria
(http://pfam.wustl.edu/cgi-bin/getdesc?name=MAM33).

Transferase-Related Sequences

[0244] Transferases are enzymes that transfer a designated group of
atoms from a donor molecule to an acceptor molecule. For example, acyl transferases
transfer acyl groups, methyl transferases transfer methyl groups, nucleotidyl
transferases transfer nucleotides, prenyltransferases transfer prenyl groups, and
glycosyl transferases transfer glycosyl groups (Lin et al., 1996). Examples of
transferases include acetyltransferases, hydroxymethyltransferases, sialyltransferases,
arginine N-methyltransferase, glucuronosyltransferase, NTP-transferase, and
GDP-mannose pyrophosphorylase B.

[0245] Transferase-related sequences possess or interact with
UDP-glucuronosyl and UDP-glucosyl transferase domains, which are protein domains
found in a superfamily of enzymes that catalyze the addition of the glycosyl group
from a UTP-sugar to a small hydrophobic molecule
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UDPGT). Transferase-related sequences also possess or interact
with nucleotide transferase (NTP_transferase) domains, which are protein domains
that transfer nucleotides onto phosphorylated sugars
(http://pfam.wustl.edu/cgi-bin/getdesc?name=NTP_transferase).

Transposase-Related Sequences 

[0246] Transposases are site-specific recombination enzymes that
catalyze the transposition of a segment of DNA from one part of the genome to
another. The movable segments are called transposable elements; each transposable 
element is occasionally moved by a transposase, which functions as an integrase, by 
inserting DNA sequences into other DNA sequences. Transposases are often encoded 
by the DNA of the transposable element itself. Transposases bind specifically to 
terminal inverted repeats of 10-500 bp that are characteristically part of transposable 
elements (Smit and Riggs, 1996). They catalyze both cutting and pasting of a 
transposable element from one segment of the genome to another. Sequences related 
to transposases can have other functions, e.g., as transcription factors, or in the 
assembly of centromere proteins (Smit and Riggs, 1996). Examples of transposase- 
related sequences include mariner, pogo, hobo, tigger, MER37, Galileo, Ocean, 
Impala, Tn MERI1, MsqTc3, and the sleeping beauty transposon system (Robertson 
and Zumpano, 1997; Robertson, 1996; Smit and Riggs, 1996). 

[0247] Transposase-related sequences can possess or interact with a
transposase 1 (Transposase_1) domain, which is characterized by sequences that can
excise and/or insert mobile genetic elements such as transposons or insertion
sequences; for example, mariner possesses a transposase 1 domain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Transposase_1). Transposase-related
sequences can also possess or interact with L1 transposable element (Transposase_22)
domains, which have been described above. Transposase-related sequences can also
possess or interact with a DDE endonuclease (DDE) domain, which is responsible for
coordinating metal ions needed for endonuclease catalytic activity
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DDE). Transposase-related sequences can additionally
possess or interact with a zinc finger, C2H2 type (zf-C2H2) domain, which binds
nucleic acids using a mechanism that involves coordinating a zinc atom with a pair of
cysteine residues and a pair of histidine residues
(http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-C2H2). Transposase-related sequences can also possess or
interact with a reverse transcriptase (rvt) domain, and/or a low-density lipoprotein
receptor (ldl_rece) domain, both of which are described above.

Ubiquitin-Related Sequences

[0248] Ubiquitin is a protein found in all eucaryotic cells examined to
date. When it is linked to the lysine side chain of a protein by the formation of an
amide bond with its C-terminal glycine, ubiquitin renders the ubiquitin-bound protein
subject to rapid proteolysis in the proteasome. In addition to its role in the selective
degradation of cellular proteins, ubiquitin also plays a role in maintaining
chromosome structure, regulating gene expression, responding to stresses on the
organism, and ribosome biogenesis. Examples of
ubiquitin-related sequences include elongins, ubiquitin-specific proteases,
ubiquitin-calmodulin ligase, ubiquitin carrier protein kinase, ubiquitin N-alpha-protein
hydrolase, and the small ubiquitin-related modifier (Sumo-1) (Kamitani et al., 1997).

[0249] Ubiquitin-related sequences can possess or interact with a
ubiquitin domain, which is a conserved sequence of approximately 76 amino acid
residues that comprise the protein ubiquitin
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ubiquitin). Ubiquitin-related sequences can also possess or
interact with a ubiquitin carboxyl-terminal hydrolase (UCH) domain, which is a protein
domain that comprises a thiol protease that recognizes and hydrolyzes the peptide
bond at the C-terminal glycine of ubiquitin
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UCH).

Virus-Related Sequences 

[0250] The human chromosome has integrated endogenous genes that
are related to viral genes. Some endogenous viral genes, e.g., the retroviral HERV-W
family, are widely and heterogeneously dispersed among human chromosomes
(Voisset et al., 2000; Everett et al., 1997; Werner et al., 1990). Endogenous
proviruses are usually transcriptionally silent, but are expressed under certain
conditions (Coffin et al., 1997). Endogenous viral expression can be specific to host
factors, such as cell type or stage of differentiation, as well as other factors including
the position on the chromosome, the influence of cis-acting sequences, or the presence
of host-mediated DNA methylation (Coffin et al., 1997).

[0251] Endogenous viral expression can have a number of
consequences, both beneficial and detrimental. Among the beneficial consequences is
the ability of endogenous retroviruses to confer resistance to infection by exogenous
viruses. For example, mice with endogenous mouse mammary tumor virus (MMTV)
can be immune to exogenous infection (Golovkina, et al., 1992). Among the
detrimental effects is a causative role in disease. Evidence indicates an association
between endogenous viruses and cancers and autoimmune diseases (Coffin et al.,
1997). For example, spontaneous tumors of specific origin, murine mammary 
adenocarcinomas, and murine T-cell lymphomas have been associated with the 
presence of specific endogenous retroviruses. Furthermore, a transformed phenotype 
is associated with the increased transcription of certain classes of endogenous viral 
elements (Coffin et al., 1997). With respect to autoimmune disease, an endogenous 
virus that influences the immunoregulatory process has been associated with 
spontaneous autoimmune thyroiditis in a chicken model of human Hashimoto disease 
(Wick et al., 1987). Examples of viral-related proteins include hepatitis B virus x- 
interacting protein, herpesvirus associated ubiquitin-specific protease, and 
Coxsackievirus and adenovirus receptor precursor. 

[0252] Viral-related sequences can possess or interact with rvt, rve, and 
gag_p30 sequences, all of which are described above. 

Zinc Finger-Related Sequences 

[0253] A zinc finger domain is a small, self-folding, structural motif of 25 to 
30 amino-acid residues present in many nucleic acid-binding proteins. It is comprised 
of a polypeptide loop held in a hairpin bend and bound to a zinc atom, and includes 
two conserved cysteine and two conserved histidine residues. Many classes of zinc 
fingers have been characterized according to the number and positions of the 
conserved histidine and cysteine residues. The amino acid configuration that holds 
the zinc atom in a tetrahedral array has a finger-like projection that interacts with 
nucleotides in the major groove of the bound nucleic acid. Zinc finger motifs have 
conserved regions near the zinc atom, and variable regions at the nucleic acid
binding site that provide specificity for the nucleic acid sequences they bind. Zinc 
finger proteins have a variety of functions, including as transcription regulators and 
intracellular receptors. Zinc finger domains are also involved in protein-protein 
interactions, e.g., those involving protein kinase C. Recently, zinc finger nucleases 
have been used to target genes for gene replacement by homologous recombination 
(Bibikova et al., 2003). Examples of zinc finger proteins include XC3H-3b, the 
transcription factor Slug, and transcription factor IIIA.

[0254] Zinc finger-related sequences can possess or interact with a zinc
finger C2H2 type (zf-C2H2) domain, which binds a zinc atom with two cysteine and
two histidine residues, and is utilized, e.g., in RNA transcription
(http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-C2H2). Zinc finger-related sequences can also possess
or interact with a C3HC4 type, RING finger (zf-C3HC4) domain, which is a
specialized type of zinc finger domain comprised of 40 to 60 amino acids that binds
two zinc atoms; variants of RING-finger domains include the C3HC4-type and the
C3H2C3-type (http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-C3HC4). Proteins
with RING-finger domains have developmental and functional roles; they are
involved in intracellular receptor binding, and in mediating protein-protein
interactions (Gray et al., 2000). RING-finger domains can exhibit ubiquitin-protein
ligase activity, and can bind to E2 ubiquitin-conjugating enzymes.
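
The two-cysteine, two-histidine arrangement of the C2H2 type can be illustrated with a simple pattern scan in Python. The spacing used here, C-x(2,4)-C-x(12)-H-x(3,5)-H, is a simplified form of a commonly cited C2H2 consensus and is an assumption of this sketch; it is not the profile used by Pfam, and the test peptide is synthetic rather than a sequence of this application.

```python
import re

# Simplified C2H2 spacing: Cys, 2-4 residues, Cys, 12 residues,
# His, 3-5 residues, His. An assumption for illustration only.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_candidates(protein):
    """Return (start_index, matched_segment) for each C2H2-like segment."""
    sequence = protein.upper()
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(sequence)]


# Synthetic peptide with one finger-like segment (hypothetical example).
peptide = "YKCPECGKSFSQKSNLQKHQRTHTG"
print(find_c2h2_candidates(peptide))
# [(2, 'CPECGKSFSQKSNLQKHQRTH')]
```

A regular expression of this kind only flags candidate segments; domain assignment in practice relies on profile-based searches against a curated family model.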

[0255] Zinc finger-related sequences can also possess or interact with a zinc
knuckle (zf-CCHC) domain, which is an 18-amino acid zinc finger domain found in
RNA-binding and single strand DNA-binding proteins; they are often involved in
eukaryotic gene regulation (http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-CCHC).
Zinc knuckles are also found in retroviral gag and nucleocapsid proteins, where they
function in genome packaging, and early in the infection process. Zinc finger-related
sequences can also possess or interact with a BTB/POZ (BTB) domain, which
mediates both homomeric and heteromeric protein dimerization
(http://pfam.wustl.edu/cgi-bin/getdesc?name=BTB). Zinc finger-related sequences can also possess or
interact with NF-X1 type zinc finger (zf-NF-X1) domains, which are found in the
transcriptional repressor NF-X1, where they repress transcription of HLA-DRA, and
in the shuttle craft protein, which plays a role in late stage embryonic neurogenesis
(http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-NF-X1). Zinc finger-related
sequences can also possess or interact with a KRAB box (KRAB) domain, also
known as a Kruppel-associated box, which is comprised of approximately 75 amino
acids, enriched in charged amino acids, and involved in protein-protein interactions
(http://pfam.wustl.edu/cgi-bin/getdesc?name=KRAB). KRAB domains can function
as transcription factors, e.g., as a transcriptional repressor, and can assume roles in
cell differentiation and development (Aubry et al., 1992; Lovering and Trowsdale,
1991). Zinc finger-related sequences can possess or interact with a transposase_22
domain, which is described above.

Industrial Applicability
[0256] The invention provides sequences related to secreted sequences, 
single-transmembrane sequences, multiple-transmembrane sequences, kinase-related 
sequences, ligase-related sequences, nuclear hormone receptor-related sequences, 
phosphatase-related sequences, protease-related sequences, phosphodiesterase-related 
sequences, kinesin-related sequences, immunoglobulin-related sequences, T-cell 
receptor-related sequences, glycosylphosphatidylinositol anchor-related sequences, 
and sequences related to other nucleic acid and amino acid sequences of the invention, 
including activators, adaptors, adhesion molecules, ATPases, ATP, breakpoints, 
channels, checkpoints, complexes, dehydrogenases, disintegrins, endopeptidases,
germ-cells, GTPases, helicases, hydrolases, integrases, integrins, isomerases,
membranes, mucins, oxygenases, peroxidases, phospholipases, prosaposins,
proteasomes, reductases, reverse transcriptases, RNases, RNases H, SH3, synthetases,
TATA boxes, Tat proteins, transferases, transposases, ubiquitins, and viruses. The 
invention provides for novel polynucleotides, related novel polypeptides and active 
fragments thereof, as well as novel nucleic acid compositions encoding these 
polypeptides, compositions comprising the related polypeptides, and methods for their 
use. 

[0257] The present invention also provides for vectors, host cells, and 
methods for producing the polynucleotides and polypeptides of the invention in these 
vectors and host cells. The present invention further provides for antisense molecules 
that are capable of regulating the expression of the polynucleotides or polypeptides 
herein. In addition, modulators, including antibodies that bind specifically to the 
polypeptides or modulate the activity of the polypeptides, are also provided. 

[0258] The present polynucleotides, polypeptides, and modulators find 
use in therapeutic agent screening/discovery applications, such as screening for 
receptors or competitive ligands, for use, for example, as small molecule therapeutic 
drugs. Also provided are methods of modulating a biological activity of a polypeptide 
and methods of treating associated disease conditions, particularly by administering 
modulators of the present polypeptides, such as small molecule modulators, antisense 
molecules, and specific antibodies. 

[0259] The present polypeptides, polynucleotides, and modulators find 
use in a number of diagnostic, prophylactic, and therapeutic applications. The 
polynucleotides and polypeptides of the invention can be detected by methods
provided herein; these methods are useful in diagnosis, and can be accomplished by
the use of diagnostic kits. The polynucleotides and polypeptides of the invention are
useful for treating a variety of disorders, including cancer, proliferative disorders,
inflammatory disorders, immune disorders, viral disorders, bacterial disorders, and
metabolic disorders. For example, subjects who suffer from a deficiency, or a lack of
a particular protein, or are otherwise in need of such protein to repair or enhance a
desirable function, benefit from the administration of a protein or an active fragment
thereof by any conventional routes of administration. These include therapeutic
vaccines in the form of nucleic acid or polypeptide vaccines, such as cancer vaccines,
where the vaccines can be administered alone, such as naked DNA, or can be
facilitated, such as via viral vectors, microsomes, or liposomes. Therapeutic
antibodies include those that are administered alone or in combination with cytotoxic
agents, such as radioactive or chemotherapeutic agents.

[0260] In particular, the polypeptides, polynucleotides, and modulators
of the present invention can be used to treat cancers, including, but not limited to, 
cancers of the prostate, breast, bone, soft tissue, liver, kidney, ovary, cervix, skin, 
pancreas, and brain, as well as leukemias, lymphomas, lung cancers such as 
adenocarcinomas and squamous cell carcinoma, and cancers of gastrointestinal organs 
such as stomach, colon, and rectum. Further, the polypeptides, polynucleotides, and 
modulators of the present invention can be used to treat inflammatory, immune, viral, 
bacterial, and metabolic diseases, disorders, syndromes, or conditions, including, but 
not limited to, intestinal inflammation and immunity, autoimmune thyroiditis, and 
retroviral infections, as well as tissue and/or organ hypertrophy. 

DISCLOSURE OF THE INVENTION 
[0261] The present invention features an isolated polynucleotide that 
encodes a polypeptide. In some embodiments, the polypeptide has at least about 70%, 
at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 
about 95%, at least about 97%, at least about 98%, or at least about 99% amino acid 
sequence identity with an amino acid sequence derived from a polynucleotide 
sequence chosen from at least one nucleotide sequence according to SEQ ID NOS.: 1- 
104. In some embodiments, the polypeptide has an amino acid sequence chosen from 
at least one amino acid sequence encoded by SEQ ID NOS.: 1-104. In many 
embodiments, the polypeptide has at least one activity associated with the naturally 
occurring encoded polypeptide. 

[0262] In some embodiments, the polypeptide includes a signal peptide. In 
alternative embodiments, the polypeptide comprises a mature form of a protein, from 
which the signal peptide has been cleaved. In other embodiments, the polypeptide is a 
signal peptide. In a further aspect, the invention provides fragments of a polypeptide 
chosen from at least one amino acid sequence encoded by SEQ ID NOS.: 1-104, 
where each fragment is an extracellular fragment of the polypeptide, or an 
extracellular fragment of the polypeptide minus the signal peptide. The invention 
provides an N-terminal fragment containing a Pfam domain, and a C-terminal 
fragment containing a Pfam domain, and either or both may be biologically active. 

[0263] In yet other embodiments, the polypeptides function as secreted 
proteins. In yet further embodiments, the polypeptides function as single- 
transmembrane proteins. In yet further embodiments, the polypeptides function as 
multiple-transmembrane proteins. In yet further embodiments, the polypeptides 
function as kinases. In yet further embodiments, the polypeptides function as protein 
kinases. In yet further embodiments, the polypeptides function as ligases. In yet 
further embodiments, the polypeptides function as nuclear hormone receptors. In yet 
further embodiments, the polypeptides function as phosphatases. In yet further 
embodiments, the polypeptides function as proteases. In yet further embodiments, the 
polypeptides function as phosphodiesterases. In yet further embodiments, the 
polypeptides function as kinesins. In yet further embodiments, the polypeptides 
function as immunoglobulins. In yet further embodiments, the polypeptides function 
as T-cell receptors. In yet further embodiments, the polypeptides function as 
glycosylphosphatidylinositol anchors. 

[0264] In yet further embodiments, the polypeptides function as cytokines. 
In still further embodiments, the polypeptides function as immune cells. In further 
embodiments, the polypeptides function as antigens. In yet further embodiments, the 
polypeptides function as receptors. In other embodiments, the polypeptides function 
as binding proteins. In other embodiments, the polypeptides function as factors. In 
further embodiments, the polypeptides function as growth factors. In further 
embodiments, the polypeptides function as heat-shock proteins. In some 
embodiments, the polypeptides function as membrane transport proteins. In yet 
further embodiments, the polypeptides function as ribosomal proteins. In some 
embodiments, the polypeptides function as zinc fingers. In some embodiments, the 
polypeptides function as embryonic stem cell-related peptides. In still further 
embodiments, the polypeptides function in pathological states. In other embodiments, 
the polypeptides function as one or more of these. 

[0265] In yet further embodiments, the polypeptides function as activators. 
In yet further embodiments, the polypeptides function as adaptors. In yet further 
embodiments, the polypeptides function as adhesion molecules. In yet further 
embodiments, the polypeptides function as ATPases. In yet further embodiments, the 
polypeptides function as ATP-related polypeptides. In further embodiments, the 
polypeptides function as channel-related polypeptides. In yet further embodiments, 
the polypeptides function as checkpoint-related polypeptides. In yet further 
embodiments, the polypeptides function as complexes. In yet further embodiments, 
the polypeptides function as dehydrogenases. In yet further embodiments, the 
polypeptides function as disintegrins. In yet further embodiments, the polypeptides 
function as endopeptidases. In yet further embodiments, the polypeptides function as 
germ-cells. In yet further embodiments, the polypeptides function as GTPases. In yet 
further embodiments, the polypeptides function as helicases. In yet further 
embodiments, the polypeptides function as hydrolases. In yet further embodiments, 
the polypeptides function as integrases. In yet further embodiments, the polypeptides 
function as integrins. In yet further embodiments, the polypeptides function as 
isomerases. In yet further embodiments, the polypeptides function as membranes. In 
yet further embodiments, the polypeptides function as mucins. In yet further 
embodiments, the polypeptides function as oxygenases. In yet further embodiments, 
the polypeptides function as peroxidases. In some embodiments, the polypeptides 
function as phospholipases. In yet further embodiments, the polypeptides function as 
prosaposins. In yet further embodiments, the polypeptides function as proteasomes. 
In yet further embodiments, the polypeptides function as reductases. In other 
embodiments, the polypeptides function as reverse transcriptase-related polypeptides. 
In yet further embodiments, the polypeptides function as RNases. In further 
embodiments, the polypeptides function as RNase H-related polypeptides. In yet 
further embodiments, the polypeptides function as SH3-related polypeptides. In yet 
further embodiments, the polypeptides function as synthetases. In yet further 
embodiments, the polypeptides function as TATA box-related polypeptides. In yet 
further embodiments, the polypeptides function as TAT-related polypeptides. In yet 
further embodiments, the polypeptides function as transferases. In yet further 
embodiments, the polypeptides function as transposases. In yet further embodiments, 
the polypeptides function as ubiquitin-related polypeptides. In yet further 
embodiments, the polypeptides function as virus-related polypeptides. In other 
embodiments, the polypeptides function as one or more of these. 

[0266] The present invention features an isolated polynucleotide that 
hybridizes under stringent hybridization conditions to a coding region of at least one 
nucleotide sequence shown in SEQ ID NOS.: 1-104, or a complement thereof. 

[0267] The present invention features an isolated polynucleotide that shares 
at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 
about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 
99% nucleotide sequence identity with a nucleotide sequence of the coding region of 
at least one sequence shown in SEQ ID NOS.: 1-104, or a complement thereof. In 
some embodiments, a subject polynucleotide has the nucleotide sequence shown in at 
least one of SEQ ID NOS.: 1-104, or a coding region thereof. 
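
By way of illustration only, the percent nucleotide sequence identity recited above can be computed from a pairwise alignment as sketched below. The function names `percent_identity` and `meets_threshold`, the use of '-' as the gap character, and the choice of denominator (every alignment column in which at least one sequence has a residue) are assumptions made for this sketch; the specification does not prescribe a particular alignment program or identity formula.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned (equal length, gaps as '-').
    Columns where both sequences carry a gap are ignored; every other
    column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given identity threshold (hypothetical helper)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A different denominator (for example, the length of the shorter sequence) would give different values for gapped alignments, so any threshold comparison depends on the convention chosen.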

[0268] The present invention also features a vector, e.g., a recombinant 
vector, that includes a subject polynucleotide, and a promoter that drives its 
expression. This vector can transform a host cell, and the present invention further 
features such host cells, e.g., isolated in vitro host cells, and in vivo host cells, that 
comprise a polynucleotide of the invention, or a recombinant vector of the invention. 

[0269] The present invention further features a library of polynucleotides, 
wherein at least one of the polynucleotides comprises the sequence information of a 
polynucleotide of the invention. In specific embodiments, the library is provided on a 
nucleic acid array. In some embodiments, the library is provided in computer- 
readable format. 

[0270] The present invention features a pair of isolated nucleic acid 
molecules, each from about 10 to about 200 nucleotides in length. The first nucleic 
acid molecule of the pair comprises a sequence of at least 10 contiguous nucleotides 
having 100% sequence identity to at least one nucleic acid sequence shown in SEQ ID 
NOS.: 1-104. The second nucleic acid molecule of the pair comprises a sequence of 
at least 10 contiguous nucleotides having 100% sequence identity to the reverse 
complement of at least one nucleic acid sequence shown in SEQ ID NOS.: 1-104. 
The sequence of said second nucleic acid molecule is located 3' of the nucleic acid 
sequence of the first nucleic acid molecule shown in SEQ ID NOS.: 1-104. The pair 
of isolated nucleic acid molecules are useful in a polymerase chain reaction or in any 
other method known in the art to amplify a nucleic acid that has sequence identity to 
the sequences shown in SEQ ID NOS.: 1-104, particularly when cDNA is used as a 
template. 
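
As a non-limiting sketch, the geometry of such a pair of nucleic acid molecules can be checked in silico as follows. The function names `reverse_complement` and `primer_pair_product` are hypothetical, and the sketch makes a simplifying assumption: each molecule matches the template exactly over its entire length, whereas the text above requires only at least 10 contiguous nucleotides of 100% identity.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair_product(template: str, first: str, second: str,
                        min_len: int = 10, max_len: int = 200):
    """Return the predicted amplicon, or None if the pair does not qualify.

    Assumptions of this sketch: `template` is the sense strand, `first`
    matches it exactly, and `second` matches its reverse complement exactly
    at a site located 3' of `first`.
    """
    template, first, second = template.upper(), first.upper(), second.upper()
    for primer in (first, second):
        if not (min_len <= len(primer) <= max_len):
            return None
    start = template.find(first)
    if start < 0:
        return None
    # The second molecule anneals where its reverse complement appears
    # on the sense strand, downstream (3') of the first molecule.
    site = reverse_complement(second)
    end = template.find(site, start + len(first))
    if end < 0:
        return None
    return template[start:end + len(site)]
```

For example, with a template containing `ACGTACGTAG` upstream of `GATTACAGAT`, the pair (`ACGTACGTAG`, `ATCTGTAATC`) yields the segment spanning both sites, since `ATCTGTAATC` is the reverse complement of `GATTACAGAT`.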

[0271] The invention features a method of determining the presence of a 
polynucleotide substantially identical to a polynucleotide sequence shown in the 
Sequence Listing, or a complement of such a nucleotide by providing its complement, 
allowing the polynucleotides to interact, and determining whether such interaction has 
occurred. 

[0272] The invention further features methods of regulating the expression 
of the subject polynucleotides and encoded polypeptides. The invention provides a 
method of inhibiting transcription or translation of a first polynucleotide encoding a 
first polypeptide of the invention by providing a second polynucleotide that 
hybridizes to the first polynucleotide, and allowing the first polynucleotide to contact 
and bind to the second polynucleotide. The second polynucleotide can be chosen 
from an antisense molecule, a ribozyme, and an interfering RNA (RNAi) molecule. 
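
For illustration, a second polynucleotide of the antisense type can be derived as the reverse complement of a window of the first polynucleotide's transcript, as sketched below. The names `antisense_oligo` and `hybridizes_perfectly`, the default window length of 20 nucleotides, and the requirement of perfect complementarity are assumptions of this sketch, not features stated in the specification.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide complementary to mrna[start:start+length].

    The transcript may be written with U or T; the oligo is returned as DNA.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the transcript")
    return target.translate(_DNA_COMPLEMENT)[::-1]


def hybridizes_perfectly(oligo: str, mrna: str) -> bool:
    """True if the oligo is perfectly complementary to some window of the transcript."""
    site = oligo.upper().replace("U", "T").translate(_DNA_COMPLEMENT)[::-1]
    return site in mrna.upper().replace("U", "T")
```

Real antisense, ribozyme, or RNAi design also weighs target accessibility and off-target matches, which this sketch does not attempt.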

[0273] The present invention further features an isolated polypeptide, e.g., an 
isolated polypeptide encoded by a polynucleotide, and biologically active fragments 
of such polypeptide. In some embodiments, the polypeptide is a fusion protein. In 
some embodiments, the polypeptide has one or more amino acid substitutions, and/or 
insertions and/or deletions, compared with at least one sequence shown in SEQ ID 
NOS.: 1-104. In some embodiments, the polypeptide has an amino acid sequence 
derived from at least one nucleotide sequence shown in SEQ ID NOS.: 1-104. 
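
As an illustrative sketch, an amino acid sequence can be derived from a nucleotide sequence by conceptual translation. The sketch below assumes the standard genetic code and assumes that the input begins at the first base of the coding region; the names `CODON_TABLE` and `translate_cds` are hypothetical and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence from its first base up to the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        # 'X' marks codons containing ambiguous or unrecognized bases.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

Under these assumptions, `translate_cds("ATGGCCATTGTAATGGGCCGCTGA")` gives `"MAIVMGR"`. Locating the coding region within a full-length cDNA is a separate step that the sketch does not cover.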

[0274] The invention also provides a method of making a polypeptide of the 
invention by providing a nucleic acid molecule that comprises a polynucleotide 
sequence encoding a polypeptide of the invention, introducing the nucleic acid 
molecule into an expression system, and allowing the polypeptide to be produced. 

[0275] In some embodiments, the method involves in vitro cell-free 
transcription and/or translation. For example, the expression system can comprise a 
cell-free expression system, such as an E. coli system, a wheat germ extract system, a 
rabbit reticulocyte system, or a frog oocyte system. 

[0276] In certain other embodiments, the expression system can comprise a 
prokaryotic or eukaryotic cell, for example, a bacterial cell expression system, a 
fungal cell expression system, such as yeast or Aspergillus, a plant cell expression 
system, e.g., a cereal plant, a tobacco plant, a tomato plant, or other edible plant, an 
insect cell expression system, such as Sf9 or High Five cells, an amphibian cell 
expression system, a reptile cell expression system, a crustacean cell expression 
system, an avian cell expression system, a fish cell expression system, or a 
mammalian cell expression system, such as one using Chinese Hamster Ovary (CHO) 
cells. In some embodiments, the method involves culturing a subject host cell under 
conditions such that the subject polypeptide is produced by the host cells; and 
recovering the subject polypeptide from the culture, e.g., from within the host cells, or 
from the culture medium. In further embodiments, the polypeptide can be produced 
in vivo in a multicellular animal or plant, comprising a polynucleotide encoding the 
subject polypeptide. 

[0277] The present invention further features a non-human animal injected 
with at least one polynucleotide comprising at least one nucleotide sequence chosen 
from SEQ ID NOS.: 1-104, and/or at least one polypeptide comprising at least one 
amino acid sequence encoded by SEQ ID NOS.: 1-104. 

[0278] The present invention further features an antibody that specifically 
recognizes, binds to, interferes with, or modulates the biological activity of a subject 
polypeptide or a fragment thereof. The polypeptide can be a single-transmembrane 
protein, multiple-transmembrane protein, kinase, protein kinase, ligase, nuclear 
hormone receptor, phosphatase, protease, phosphodiesterase, kinesin, 
immunoglobulin, T-cell receptor, glycosylphosphatidylinositol anchor, or other 
nucleic acid and amino acid sequences, including, activators, adaptors, adhesion 
molecules, ATPases, ATP, breakpoints, channels, checkpoints, complexes, 
dehydrogenases, disintegrins, endopeptidases, germ-cells, GTPases, helicases, 
hydrolases, integrases, integrins, isomerases, membranes, mucins, oxygenases, 
peroxidases, phospholipases, prosaposins, proteasomes, reductases, reverse 
transcriptases, RNases, RNases H, SH3, synthetases, TATA boxes, Tat, transferases, 
transposases, ubiquitins, and viruses. The fragment can be an extracellular fragment 
of a subject polypeptide, or an extracellular fragment of a subject polypeptide minus 
the signal peptide. 

[0279] The present invention further features an antibody that specifically 
inhibits binding of a polypeptide to its ligand or substrate. It also features an antibody 
that specifically inhibits binding of a polypeptide as a substrate to another molecule. 

[0280] Another aspect of the present invention features a library of 
antibodies or fragments thereof, wherein at least one antibody or fragment thereof 
specifically binds to at least a portion of a polypeptide comprising an amino acid 
sequence encoded by SEQ ID NOS.: 1-104, and/or wherein at least one antibody or 
fragment thereof interferes with at least one activity of such polypeptide or fragment 
thereof. In certain embodiments, the antibody library comprises at least one antibody 
or fragment thereof that specifically inhibits binding of a subject polypeptide to its 
ligand or substrate, or that specifically inhibits binding of a subject polypeptide as a 
substrate to another molecule. The present invention also features corresponding 
polynucleotide libraries comprising at least one polynucleotide sequence that encodes 
an antibody or antibody fragment of the invention. In specific embodiments, the 
library is provided on a nucleic acid array or in computer-readable format. 

[0281] An antibody of the present invention may comprise a monoclonal 
antibody, polyclonal antibody, single chain antibody, intrabody, and active fragments 
of any of these. The active fragments include variable regions from either heavy 
chains or light chains. The antibody can comprise the backbone of a molecule with an 
immunoglobulin domain, e.g., a fibronectin backbone, a T-cell receptor backbone, or 
a CTLA4 backbone. 

[0282] The present invention further features a targeting antibody, a 
neutralizing antibody, a stabilizing antibody, an enhancing antibody, an antibody 
agonist, an antibody antagonist, an antibody that promotes cellular endocytosis of a 
target antigen, a cytotoxic antibody, and an antibody that mediates antibody 
dependent cellular cytotoxicity (ADCC). The antibody that mediates ADCC can have 
a cytotoxic component, e.g., a radioisotope, a radioactive molecule, a microbial toxin, 
a plant toxin, a chemotherapeutic agent, or a chemical substance, such as doxorubicin 
or cisplatin. The invention also features an inhibitory antibody, functioning to 
specifically inhibit the binding of a cognate polypeptide to its ligand or its substrate, 
or to specifically inhibit the binding of a cognate peptide as the substrate of another 
molecule. 

[0283] The antibodies of the present invention also encompass a human 
antibody, a non-human primate antibody, a monkey antibody, a non-primate animal 
antibody, e.g., a rodent antibody, a rat antibody, a mouse antibody, a hamster antibody, 
a guinea pig antibody, a chicken antibody, a cattle antibody, a sheep antibody, a goat 
antibody, a horse antibody, a porcine antibody, a cow antibody, a rabbit antibody, a cat 
antibody, or a dog antibody. It also features a humanized antibody, a primatized 
antibody, and a chimeric antibody. 

[0284] The antibodies of the invention can be produced in vitro or in vivo. 
For example, the present invention features an antibody produced in a cell-free 
expression system, a prokaryote expression system or a eukaryote expression system, 
as described herein. 

[0285] The invention further provides a host cell that can produce an 
antibody of the invention or a fragment thereof. The antibody may also be secreted 
by the cell. The host cell can be a hybridoma, or a prokaryotic or eukaryotic cell. 
The invention also provides a bacteriophage or other virus particle comprising an 
antibody of the invention, or a fragment thereof. The bacteriophage or other virus 
particle may display the antibody or fragment thereof on its surface, and the 
bacteriophage itself may exist within a bacterial cell. The antibody may also 
comprise a fusion protein with a viral or bacteriophage protein. 

[0286] The invention further provides transgenic multicellular organisms, 
e.g., plants or non-human animals, as well as tissues or organs, comprising a 
polynucleotide sequence encoding a subject antibody or fragment thereof. The 
organism, tissues, or organs will generally comprise cells producing an antibody of 
the invention, or a fragment thereof. 

[0287] In another aspect, the present invention features a method of making 
an antibody by immunizing a host animal. In this method, a polypeptide or a 
fragment thereof, a polynucleotide encoding a polypeptide, or a polynucleotide 
encoding a fragment thereof, is introduced into an animal in a sufficient amount to 
elicit the generation of antibodies specific to the polypeptide or fragment thereof, and 
the resulting antibodies are recovered from the animal. The polypeptide can be 
encoded by a nucleic acid molecule comprising a nucleotide sequence chosen from at 
least one polynucleotide sequence according to SEQ ID NOS.: 1-104. 

[0288] The invention thus also provides a non-human animal comprising an 
antibody of the invention. The animal can be a non-human primate (e.g., a monkey), 
a rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, 
a goat, a horse, a pig, a cow), a rabbit, a cat, or a dog. 

[0289] The present invention also features a method of making an antibody 
by isolating a spleen from an animal injected with a polypeptide or a fragment 
thereof, a polynucleotide encoding a polypeptide, or a polynucleotide encoding a 
fragment thereof, and recovering antibodies from the spleen cells. Hybridomas can be 
made from the spleen cells, and hybridomas secreting specific antibodies can be 
selected. 

[0290] The present invention further features a method of making a 
polynucleotide library from spleen cells, and selecting a cDNA clone that produces 
specific antibodies, or fragments thereof. The cDNA clone or a fragment thereof can 
be expressed in an expression system that allows production of the antibody or a 
fragment thereof, as provided herein. 

[0291] The invention also provides a method for determining the presence or 
measuring the level of a polypeptide that specifically binds to an antibody of the 
invention. This method involves allowing the antibody to interact with a sample, and 
determining whether interaction between the antibody and any polypeptide in the 
sample has occurred. Antibodies that specifically bind to at least one subject 
polypeptide are useful in diagnostic assays, e.g., to detect the presence of a subject 
polypeptide. Similarly, the invention features a method of determining the presence 
of an antibody to a polypeptide of the invention, by providing the polypeptide, 
allowing the antibody and the polypeptide to interact, and determining whether 
interaction has occurred. 

[0292] The present invention further features a method of identifying an 
agent that modulates the level of a subject polypeptide (or an mRNA encoding a 
subject polypeptide) in a cell. The method generally involves contacting a cell (e.g., a 
eukaryotic cell) that produces the subject polypeptide with a test agent; and 
determining the effect, if any, of the test agent on the level of the polypeptide in the 
cell. 

[0293] The present invention further features a method of identifying an 
agent that modulates biological activity of a subject polypeptide. The methods 
generally involve contacting a subject polypeptide with a test agent; and determining 
the effect, if any, of the test agent on the activity of the polypeptide. In certain 
embodiments, the polypeptide is expressed on a cell surface. In certain embodiments, 
the agent or modulator is an antibody, for example, where an antibody binds to the 
polypeptide or affects its biological activity. 

[0294] The present invention further features biologically active agents (or 
modulators) identified using a method of the invention. 

[0295] The present invention also features a method of modulating 
biological activity using an agent selectable by the above methods. Briefly, the 
method of modulating biological activity comprises contacting the agent with a first 
human or a non-human host cell, thereby modulating the activity of the first host cell 
or a second host cell. In one example, contacting the agent with the first human or 
non-human host cell results in the recruitment of a second host cell. The agent may 
be an antibody or antibody fragment of the invention. 

[0296] The modulation can comprise directly enhancing cell activity, 
indirectly enhancing cell activity, directly inhibiting cell activity, or indirectly 
inhibiting cell activity. The cell activity that is modulated can include transcription, 
translation, cell cycle control, signal transduction, intracellular trafficking, cell 
adhesion, cell mobility, proteolysis, ion transport, water transport, DNA repair, 
hydrolysis, lipase activity, polymerization using an RNA template or a DNA template, 
and nuclease activity. The modulation can result in cell death or apoptosis, or 
inhibition of cell death or apoptosis, as well as cell growth, cell proliferation, or cell 
survival, or inhibition of cell growth, cell proliferation, or cell survival; as well as 
mucosal preservation, inhibition of eicosanoid synthesis, or resistance to infection by 
viruses. 

[0297] Either the first or the second host cell can be a human or a non- 
human host cell. Either the first or the second host cell can be an immune cell, e.g., a 
T cell, B cell, NK cell, dendritic cell, macrophage, muscle cell, stem cell, skin cell, fat 
cell, blood cell, brain cell, bone marrow cell, endothelial cell, retinal cell, bone cell, 
kidney cell, pancreatic cell, liver cell, spleen cell, prostate cell, cervical cell, ovarian 
cell, breast cell, lung cell, liver cell, soft tissue cell, colorectal cell, other cell of the 
gastrointestinal tract, or a cancer cell. 

[0298] The invention also provides a method of diagnosing cancer, 
proliferative, inflammatory, immune, viral, bacterial, or metabolic disorder in a 
patient, by allowing an antibody specific for a polypeptide of the invention to contact 
a patient sample, and detecting specific binding between the antibody and any antigen 
in the sample to determine whether the subject has cancer, proliferative, 
inflammatory, immune, viral, bacterial, or metabolic disorder. 

[0299] The invention further provides a method of diagnosing cancer, 
proliferative, inflammatory, immune, viral, bacterial, or metabolic disorder in a 
patient, by allowing a polypeptide of the invention to contact a patient sample, and 
detecting specific binding between the polypeptide and any interacting molecule in 
the sample to determine whether the subject has cancer, proliferative, inflammatory, 
immune, viral, bacterial, or metabolic disorder. 

[0300] The invention also features a method of providing a polynucleotide, a 
polypeptide, or an agent of the invention, such as an antibody, to a subject by oral, 
buccal, nasal, rectal, intraperitoneal, intradermal, transdermal, intratracheal, 
intrathecal, or parenteral administration, or otherwise by implantation or inhalation. 
For example, the polynucleotide, polypeptide or agent can be administered 
intranasally, intravenously, intra-arterially, intracardiacally, subcutaneously, 
intraperitoneally, transdermally, intraventricularly, or intracranially. The invention 
also provides a method for formulating a polynucleotide, polypeptide, or modulator 
composition, such as an antibody composition, for delivery by any of the routes of 
administration provided above, for example, for treatment of disorders. For example, 
the parenteral delivery can be via inhalation or implantation. The parenteral delivery 
can also be oral, intranasal, intraventricular, or intracranial. 

[0301] The present invention also features a pharmaceutical composition 
comprising a polynucleotide, polypeptide, or modulator of the invention and a carrier. 
The carrier can be a pharmaceutically acceptable carrier. The modulator can be 
obtainable by any methods of the invention, for example, the modulator can be an 
antibody or a fragment thereof. Further, oral formulations, preparations for injection, 
aerosol formulations, and suppositories can be prepared, each comprising the 
polynucleotide, polypeptide, or modulator composition. Further, nucleic acid 
compositions comprising polynucleotide sequences encoding the subject antibodies, 
or fragments thereof, can be prepared for administration to a subject. 

[0302] The invention also features a non-human animal injected with the 
polynucleotide, polypeptide, or modulator composition, for example the antibody 
composition. Again, the animal can be a non-human primate (e.g., a monkey), a 
rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, a 
goat, a horse, a pig, a cow), a rabbit, a cat, or a dog. 

[0303] In another aspect, the invention provides a method of treating a 
disorder in a subject needing or desiring such treatment, comprising administering a 
polynucleotide, polypeptide, or modulator of the invention to the subject. The subject 
can be a human or a non-human animal. The disorder can be cancer, proliferative, 
inflammatory, immune, metabolic, ulcerative, bacterial, or viral disorders. 

[0304] For example, the method of treatment may comprise administering an 
antibody composition with a first antibody that specifically binds to a first epitope of a 
first polypeptide or a fragment thereof, or that interferes with at least one activity of 
the first polypeptide or a fragment thereof, wherein the first polypeptide is encoded by 
a nucleic acid molecule comprising a nucleotide sequence chosen from SEQ ID 
NOS.: 1-104, or any nucleic acid of the present invention. In certain embodiments, 
this method further comprises using a second antibody that binds specifically to or 
interferes with the activity of a second epitope of the first polypeptide or to a first 
epitope of a second polypeptide. The second polypeptide can be encoded by a nucleic 
acid molecule comprising a nucleotide sequence chosen from SEQ ID NOS.: 1-104, 
or any nucleic acid of the present invention. In certain embodiments, the antibody 
binds, or interferes with the activity of, at least one polypeptide fragment, wherein the 
fragment is an extracellular fragment of the polypeptide, or an extracellular fragment 
of the polypeptide minus the signal peptide, for the treatment, for example, of 
proliferative disorders, such as cancer. 

[0305] In other embodiments, the modulator may bind to a cell surface 
molecule that is over-expressed in the disorder. Further the modulator may be linked 
to an antibody of the invention. The antibody can be capable of initiating antibody 
dependent cell cytotoxicity, e.g., where the antibody is in turn coupled to cytotoxic 
agents. This method is applicable when the disorder is cancer, another proliferative 
disorder, inflammatory, immune, bacterial, viral, or metabolic disorder, and the cell 
surface molecule is over-expressed in a cancer cell, diseased cell or virus-infected 
cell. The cell surface molecule can be a single-transmembrane-related protein, a 
multiple-transmembrane-related protein, a kinase-related protein, a protein kinase- 
related protein, a ligase-related protein, a nuclear hormone receptor-related protein, a 
phosphatase-related protein, a protease-related protein, a phosphodiesterase-related 
protein, a kinesin-related protein, an immunoglobulin-related protein, a T-cell 
receptor-related protein, a glycosylphosphatidylinositol anchor-related protein, or 
other amino acid sequence, including, an activator-related protein, an adaptor-related 
protein, an adhesion molecule-related protein, an ATPase-related protein, an ATP- 
related protein, a breakpoint-related protein, a channel-related protein, a checkpoint- 
related protein, a complex-related protein, a dehydrogenase-related protein, a 
disintegrin-related protein, an endopeptidase-related protein, a germ-cell-related 
protein, a GTPase-related protein, a helicase-related protein, a hydrolase-related 
protein, an integrase-related protein, an integrin-related protein, an isomerase-related 
protein, a membrane-related protein, a mucin-related protein, an oxygenase-related 
protein, a peroxidase-related protein, a phospholipase-related protein, a prosaposin- 
related protein, a proteasome-related protein, a reductase-related protein, a reverse 
transcriptase-related protein, an RNase-related protein, an RNase H-related protein, an 
SH3-related protein, a synthetase-related protein, a TATA box-related protein, a Tat- 
related protein, a transferase-related protein, a transposase-related protein, a ubiquitin- 
related protein, or a virus-related protein that is over-expressed in cancer, proliferative, 
inflammatory, immune, bacterial, viral, or metabolic disorder. 

[0306] The invention also provides a method for prophylactic or therapeutic 
treatment of a subject needing or desiring such treatment by providing a vaccine that 
can be administered to the subject. The vaccine may comprise one or more of a 
polynucleotide, polypeptide, or modulator of the invention, for example an antibody 
vaccine composition, a polypeptide vaccine composition, or a polynucleotide vaccine 
composition, useful for treating cancer, proliferative, inflammatory, immune, 
metabolic, bacterial, or viral disorders. 

[0307] For example, the vaccine can be a cancer vaccine, and the 
polypeptide can concomitantly be a cancer antigen. The vaccine may be an anti- 
inflammatory vaccine, and the polypeptide can concomitantly be an inflammation- 
related antigen. The vaccine may be a viral vaccine, and the polypeptide can 
concomitantly be a viral antigen. In some embodiments, the vaccine comprises a 
polypeptide fragment, comprising at least one extracellular fragment of a polypeptide 
of the invention, and/or at least one extracellular fragment of a polypeptide of the 
invention minus the signal peptide, for the treatment, for example, of proliferative 
disorders, such as cancer. In certain embodiments, the vaccine comprises a 
polynucleotide encoding one or more such fragments, administered for the treatment, 
for example, of proliferative disorders, such as cancer. Further, the vaccine can be 
administered with or without an adjuvant. 

[0308] In another aspect, the invention provides a method for gene therapy 
by providing a polynucleotide comprising a nucleic acid molecule encoding a 
polypeptide, such as an antibody of the invention, and administering the 
polynucleotide to a subject needing or desiring such treatment. 

[0309] The invention further provides a kit comprising one or more of a 
polynucleotide, polypeptide, or modulator composition, such as an antibody 
composition, which may include instructions for its use. Such kits are useful in 
diagnostic applications, for example, to detect the presence and/or level of a 
polypeptide in a biological sample by specific antibody interaction. 

Modes for Carrying out the Invention 
Brief Description of the Table 

[0310] Each sequence shown in Table 1 is identified by a Five Prime 
Therapeutics, Inc. (FP) identification number (FP ID). Each protein in Table 1 is also 
described by an annotation of the Fantom mouse protein with the greatest degree of 
similarity to the claimed sequences. The Fantom database was compiled by the 
Fantom Consortium and is accessible, for example, at http://fantom.gsc.riken.go.jp/db/ 
(Bono et al., 2002). It provides curated functional annotation to full-length 
mouse sequences (Okazaki et al., 2002). The similarities of the claimed sequences of 
the invention with the annotated sequences in Table 1 suggest that they may share 
structural and functional properties, and exhibit similar expression profiles and 
localizations. 
Definitions 

[0311] "Related sequences" include nucleotide and amino acid sequences 
that are involved in the function of their referent. For example, "receptor-related 
sequences" include all sequences that are involved in receptor function. This 
includes, but is not limited to, sequences that are involved in receptor synthesis, 
receptor regulation, receptor effector function, and receptor degradation. "Related 
sequences" also encompass complementary nucleic acid sequences, and biologically 
active fragments of nucleic acid and amino acid sequences. 

[0312] The terms "polynucleotide," "nucleotide," "nucleic acid," 
"polynucleic molecule," "nucleotide molecule," "nucleic acid molecule," "nucleic acid 
sequence," "polynucleotide sequence," and "nucleotide sequence" are used 
interchangeably herein to refer to polymeric forms of nucleotides of any length. The 
polynucleotides can contain deoxyribonucleotides, ribonucleotides, and/or their 
analogs or derivatives. For example, nucleic acids can be naturally occurring DNA or 
RNA, or can be synthetic analogs, as known in the art. The terms also encompass 
genomic DNA, genes, gene fragments, exons, introns, regulatory sequences or 
regulatory elements (such as promoters, enhancers, initiation and termination regions, 
other control regions, expression regulatory factors, and expression controls), DNA 
comprising one or more single-nucleotide polymorphisms (SNPs), allelic variants, 
isolated DNA of any sequence, and cDNA. The terms also encompass mRNA, tRNA, 
rRNA, ribozymes, splice variants, antisense RNA, antisense conjugates, RNAi, and 
isolated RNA of any sequence. The terms also encompass recombinant 
polynucleotides, heterologous polynucleotides, branched polynucleotides, labeled 
polynucleotides, hybrid DNA/RNA, polynucleotide constructs, vectors comprising the 
subject nucleic acids, nucleic acid probes, primers, and primer pairs. The 
polynucleotides can comprise modified nucleic acid molecules, with alterations in the 
backbone, sugars, or heterocyclic bases, such as methylated nucleic acid molecules, 
peptide nucleic acids, and nucleic acid molecule analogs, which may be suitable as, 
for example, probes if they demonstrate superior stability and/or binding affinity 
under assay conditions. Analogs of purines and pyrimidines, including radiolabeled 
and fluorescent analogs, are known in the art. The polynucleotides can have any 
three-dimensional structure, and can perform any function, known or as yet unknown. 
The terms also encompass single-stranded, double-stranded and triple helical 
molecules that are either DNA, RNA, or hybrid DNA/RNA and that may encode a 
full-length gene or a biologically active fragment thereof. Biologically active 
fragments of polynucleotides can encode the polypeptides herein, as well as anti-sense 
and RNAi molecules. Thus, the full length polynucleotides herein may be treated 
with enzymes, such as Dicer, to generate a library of short RNAi fragments which are 
within the scope of the present invention. 

[0313] The novel polynucleotides herein include those shown in the Table, 
SEQ ID NOS.: 1-104, and biologically active fragments thereof. The 
polynucleotides also include modified, labeled, and degenerate variants of the nucleic 
acid sequences, as well as nucleic acid sequences that are substantially similar or 
homologous to nucleic acids encoding the subject proteins. 

[0314] A "biologically active" entity, or an entity having "biological 
activity," is one having structural, regulatory, or biochemical functions of a naturally 
occurring molecule or any function related to or associated with a metabolic or 
physiological process. Biologically active polynucleotide fragments are those 
exhibiting activity similar, but not necessarily identical, to an activity of a 
polynucleotide of the present invention. The biological activity can include an 
improved desired activity, or a decreased undesirable activity. For example, an entity 
demonstrates biological activity when it participates in a molecular interaction with 
another molecule, or when it has therapeutic value in alleviating a disease condition, 
or when it has prophylactic value in inducing an immune response to the molecule, or 
when it has diagnostic value in determining the presence of the molecule, such as a 
biologically active fragment of a polynucleotide that can be detected as unique for the 
polynucleotide molecule, or that can be used as a primer in PCR. 

[0315] The term "degenerate variant" of a nucleic acid sequence refers to all 
nucleic acid sequences that can be directly translated, according to the standard 
genetic code, to provide an amino acid sequence identical to that translated from a 
reference nucleic acid sequence. 
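
The degenerate-variant test described above can be illustrated with a short sketch. It assumes the standard genetic code, in-frame input sequences, and the hypothetical helper names `translate` and `is_degenerate_variant`, none of which appear in the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an in-frame DNA or RNA sequence; trailing partial codons are ignored."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(candidate, reference):
    """True if both sequences translate to the identical amino acid sequence."""
    return translate(candidate) == translate(reference)
```

For instance, ATGGCT and ATGGCC both encode Met-Ala and are therefore degenerate variants of one another under this sketch, whereas ATGGAT (Met-Asp) is not a degenerate variant of either.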

[0316] The term "gene" or "genomic sequence" as used herein is an open 
reading frame encoding specific proteins and polypeptides, for example, an mRNA, 
cDNA, or genomic DNA, and also may or may not include intervening introns, or 
adjacent 5' and 3' non-coding nucleotide sequences involved in the regulation of 
expression up to about 20 kb beyond the coding region, and possibly further in either 
direction. A gene can be introduced into an appropriate vector for extrachromosomal 
maintenance or for integration into a host genome. 

[0317] The term "transgene" as used herein is a nucleic acid sequence that is 
incorporated into a transgenic organism. A "transgene" can contain one or more 
transcriptional regulatory sequences, and other sequences, such as introns, that may be 
useful for expressing or secreting the nucleic acid or fusion protein it encodes. 

[0318] The term "cDNA" as used herein is intended to include all nucleic 
acids that share the sequence elements of mature mRNA species, where sequence 
elements are exons and 3' and 5' non-coding regions. Generally, mRNA species have 
contiguous exons, the intervening introns having been removed by nuclear RNA 
splicing to create a continuous open reading frame encoding a protein. 

[0319] The term "splice variant" refers to all types of RNAs transcribed from 
a given gene that when processed collectively encode plural protein isoforms. The 
term "alternative splicing" and related terms refer to all types of RNA processing that 
lead to expression of plural protein isoforms from a single gene. Some genes are first 
transcribed as long mRNA precursors that are then shortened by a series of processing 
steps to produce the mature mRNA molecule. One of these steps is RNA splicing, in 
which the intron sequences are removed from the mRNA precursor. A cell can splice 
the primary transcript in different ways, making different "splice variants," and 
thereby making different polypeptide chains from the same gene, or from the same 
mRNA molecule. Splice variants can include, for example, exon insertions, exon 
extensions, exon truncations, exon deletions, alternatives in the 5' untranslated region 
and alternatives in the 3' untranslated region. 

[0320] "Oligonucleotide" may generally refer to polynucleotides of between 
about 5 and about 100 nucleotides of single- or double-stranded nucleic acids. For the 
purposes of this disclosure, there is no upper limit to the length of an oligonucleotide. 
Oligonucleotides are also known as oligomers or oligos and can be isolated from 
genes, or chemically synthesized by methods known in the art. 

[0321] "Nucleic acid composition" as used herein is a composition 
comprising a nucleic acid sequence, including one having an open reading frame that 
encodes a polypeptide and is capable, under appropriate conditions, of being 
expressed as a polypeptide. The term includes, for example, vectors, including 
plasmids, cosmids, viral vectors (e.g., retrovirus vectors such as lentivirus, 
adenovirus, and the like), human, yeast, bacterial, P1-derived artificial chromosomes 
(HAC's, YAC's, BAC's, PAC's, etc), and mini-chromosomes, in vitro host cells, in 
vivo host cells, tissues, organs, allogenic or congenic grafts or transplants, 
multicellular organisms, and chimeric, genetically modified, or transgenic animals 
comprising a subject nucleic acid sequence. 

[0322] An "isolated," "purified," or "substantially isolated" polynucleotide, 
or a polynucleotide in "substantially pure form," in "substantially purified form," in 
"substantial purity," or as an "isolate," is one that is substantially free of the sequences 
with which it is associated in nature, or other nucleic acid sequences that do not 
include a sequence or fragment of the subject polynucleotides. By substantially free 
is meant that less than about 90%, less than about 80%, less than about 70%, less than 
about 60%, or less than about 50% of the composition is made up of materials other 
than the isolated polynucleotide. For example, the isolated polynucleotide is at least 
about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 
90%, at least about 95%, at least about 97%, or at least about 99% free of the 
materials with which it is associated in nature. For example, an isolated 
polynucleotide may be present in a composition wherein at least about 50%, at least 
about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 
95%, at least about 97%, or at least about 99% of the total macromolecules (for example, 
polypeptides, fragments thereof, polynucleotides, fragments thereof, lipids, 
polysaccharides, and oligosaccharides) in the composition is the isolated 
polynucleotide. Where at least about 99% of the total macromolecules is the isolated 
polynucleotide, the polynucleotide is at least about 99% pure, and the composition 
comprises less than about 1% contaminant. As used herein, an "isolated," "purified" 
or "substantially isolated" polynucleotide, or a polynucleotide in "substantially pure 
form," in "substantially purified form," in "substantial purity," or as an "isolate," also 
refers to recombinant polynucleotides, modified, degenerate and homologous 
polynucleotides, and chemically synthesized polynucleotides, which, by virtue of 
origin or manipulation, are not associated with all or a portion of a polynucleotide 
with which it is associated in nature, are linked to a polynucleotide other than that to 
which it is linked in nature, or do not occur in nature. For example, the subject 
polynucleotides are generally provided as other than on an intact chromosome, and 
recombinant embodiments are typically flanked by one or more nucleotides not 
normally associated with the subject polynucleotide on a naturally-occurring 
chromosome. 

[0323] The terms "polypeptide," "peptide," and "protein," used 
interchangeably herein, refer to a polymeric form of amino acids of any length, which 
can include naturally-occurring amino acids, coded and non-coded amino acids, 
chemically or biochemically modified, derivatized, or designer amino acids, amino 
acid analogs, peptidomimetics, and depsipeptides, and polypeptides having modified, 
cyclic, bicyclic, depsicyclic, or depsibicyclic peptide backbones. The term includes 
single chain protein as well as multimers. The term also includes conjugated proteins, 
fusion proteins, including, but not limited to, GST fusion proteins, fusion proteins 
with a heterologous amino acid sequence, fusion proteins with heterologous and 
homologous leader sequences, fusion proteins with or without N-terminal methionine 
residues, pegylated proteins, and immunologically tagged proteins. Also included in 
this term are variations of naturally occurring proteins, where such variations are 
homologous or substantially similar to the naturally occurring protein, as well as 
corresponding homologs from different species. Variants of polypeptide sequences 
include insertions, additions, deletions, or substitutions compared with the subject 
polypeptides. The term also includes peptide aptamers. 

[0324] The novel polypeptides herein include amino acid sequences encoded 
by an open reading frame (ORF), described in greater detail below, including the full 
length protein and fragments thereof, particularly biologically active fragments and/or 
fragments corresponding to functional domains, e.g., a signal peptide or leader 
sequence, an enzyme active site, including a cleavage site and an enzyme catalytic 
site, a domain for interaction with other protein(s), a domain for binding DNA, a 
regulatory domain, a consensus domain that is shared with other members of the same 
protein family, such as a kinase family or an immunoglobulin family; an extracellular 
domain that may act as a target for antibody production or that may be cleaved to 
become a soluble receptor or a ligand for a receptor; an intracellular fragment of a 
transmembrane protein that participates in signal transduction; a transmembrane 
domain of a transmembrane protein that may facilitate water or ion transport; a 
sequence associated with cell survival and/or cell proliferation; a sequence associated 
with cell cycle arrest, DNA repair and/or apoptosis; a sequence associated with a 
disease or disease prognosis, including types of cancer, degenerative disease, 
inflammatory disease, immunological disease, genetic disease, metabolic disease, 
and/or viral infection; and including fusions of the subject polypeptides to other 
proteins or parts thereof; modifications of the subject polypeptide, e.g., comprising 
modified, derivatized, or designer amino acids, modified peptide backbones, and/or 
immunological tags; as well as intra- and inter-species homologs of the subject 
polypeptides. 

[0325] As noted above, a "biologically active" entity, or an entity having 
"biological activity," is one having structural, regulatory, or biochemical functions of 
a naturally occurring molecule or any function related to or associated with a 
metabolic or physiological process. Biologically active polypeptide fragments are 
those exhibiting activity similar, but not necessarily identical, to an activity of a 
polypeptide of the present invention. The biological activity can include an improved 
desired activity, or a decreased undesirable activity. For example, an entity 
demonstrates biological activity when it participates in a molecular interaction with 
another molecule, or when it has therapeutic value in alleviating a disease condition, 
or when it has prophylactic value in inducing an immune response to the molecule, or 
when it has diagnostic value in determining the presence of the molecule. A 
biologically active polypeptide or fragment thereof includes one that can participate in 
a biological reaction, for example, as a transcription factor that combines with other 
transcription factors for initiation of transcription, or that can serve as an epitope or 
immunogen to stimulate an immune response, such as production of antibodies, or 
that can transport molecules into or out of cells, or that can perform a catalytic 
activity, for example polymerization or nuclease activity, or that can participate in 
signal transduction by binding to receptors, proteins, or nucleic acids, activating 
enzymes or substrates. 

[0326] A "signal peptide," or a "leader sequence," comprises a sequence of 
amino acid residues, typically, at the N terminus of a polypeptide, which directs the 
intracellular trafficking of the polypeptide. Polypeptides that contain a signal peptide 
or leader sequence typically also contain a signal peptide or leader sequence cleavage 
site. Such polypeptides, after cleavage at the cleavage sites, generate mature 
polypeptides, for example, after extracellular secretion or after being directed to the 
appropriate intracellular compartment. 

[0327] "Depsipeptides" are compounds containing a sequence of at least two 
alpha-amino acids and at least one alpha-hydroxy carboxylic acid, which are bound 
through at least one normal peptide link and ester links, derived from the hydroxy 
carboxylic acids. "Linear depsipeptides" can comprise rings formed through S-S 
bridges, or through an hydroxy or a mercapto group of an hydroxy-, or mercapto- 
amino acid and the carboxyl group of another amino- or hydroxy-acid but do not 
comprise rings formed only through peptide or ester links derived from hydroxy 
carboxylic acids. "Cyclic depsipeptides" are peptides containing at least one ring 
formed only through peptide or ester links, derived from hydroxy carboxylic acids. 

[0328] An "isolated," "purified," or "substantially isolated" polypeptide, or a 
polypeptide in "substantially pure form," in "substantially purified form," in 
"substantial purity," or as an "isolate," is one that is substantially free of the materials 
with which it is associated in nature or other polypeptide sequences that do not 
include a sequence or fragment of the subject polypeptides. By substantially free is 
meant that less than about 90%, less than about 80%, less than about 70%, less than 
about 60%, or less than about 50% of the composition is made up of materials other 
than the isolated polypeptide. For example, the isolated polypeptide is at least about 
50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at 
least about 95%, at least about 97%, or at least about 99% free of the materials with 
which it is associated in nature. For example, an isolated polypeptide may be present 
in a composition wherein at least about 50%, at least about 60%, at least about 70%, 
at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at 
least about 99% of the total macromolecules (for example, polypeptides, fragments 
thereof, polynucleotides, fragments thereof, lipids, polysaccharides, and 
oligosaccharides) in the composition is the isolated polypeptide. Where at least about 
99% of the total macromolecules is the isolated polypeptide, the polypeptide is at least 
about 99% pure, and the composition comprises less than about 1% contaminant. As 
used herein, an "isolated," "purified," or "substantially isolated" polypeptide, or a 
polypeptide in "substantially pure form," in "substantially purified form," in 
"substantial purity," or as an "isolate," also refers to recombinant polypeptides, 
modified, tagged and fusion polypeptides, and chemically synthesized polypeptides, 
which, by virtue of origin or manipulation, are not associated with all or a portion of 
the materials with which they are associated in nature, are linked to molecules other 
than that to which they are linked in nature, or do not occur in nature. 

[0329] Detection methods of the invention can be qualitative or quantitative. 
Thus, as used herein, the terms "detection," "identification," "determination," and the 
like, refer to both qualitative and quantitative determinations, and include 
"measuring." For example, detection methods include methods for detecting the 
presence and/or level of polynucleotide or polypeptide in a biological sample, and 
methods for detecting the presence and/or level of biological activity of 
polynucleotide or polypeptide in a sample. 

[0330] As used herein, the term "array" or "microarray" may be used 
interchangeably and refers to a collection of plural biological molecules such as 
nucleic acids, polypeptides, or antibodies, having locatable addresses that may be 
separately detectable. Generally, "microarray" encompasses use of sub-microgram 
quantities of biological molecules. The biological molecules may be affixed to a 
substrate or may be in solution or suspension. The substrate can be porous or solid, 
planar or non-planar, unitary or distributed, such as a glass slide or a 96-well plate, with 
or without the use of microbeads or nanobeads. As such, the term "microarray" 
includes all of the devices referred to as microarrays in Schena, 1999; Bassett et al., 
1999; Bowtell, 1999; Brown and Botstein, 1999; Chakravarti, 1999; Cheung et al., 
1999; Cole et al., 1999; Collins, 1999; Debouck and Goodfellow, 1999; Duggan et al., 
1999; Hacia, 1999; Lander, 1999; Lipshutz et al., 1999; Southern, et al., 1999; 
Schena, 2000; Brenner et al., 2000; Lander, 2001; Steinhaur et al., 2002; and Espejo et 
al., 2002. Nucleic acid microarrays include both oligonucleotide arrays (DNA chips) 
containing expressed sequence tags ("ESTs") and arrays of larger DNA sequences 
representing a plurality of genes bound to the substrate, either one of which can be 
used for hybridization studies. Protein and antibody microarrays include arrays of 
polypeptides or proteins, including but not limited to, polypeptides or proteins 
obtained by purification, fusion proteins, and antibodies, and can be used for specific 
binding studies (Zhu and Snyder, 2003; Houseman et al., 2002; Schaeferling et al., 
2002; Weng et al., 2002; Winssinger et al., 2002; Zhu et al., 2001; Zhu et al., 2001; 
and MacBeath and Schreiber, 2000). 

[0331] A "nucleic acid hybridization reaction" is one in which single strands 
of DNA or RNA randomly collide with one another, and bind to each other only when 
their nucleotide sequences have some degree of complementarity. The solvent and 
temperature conditions can be varied in the reactions to modulate the extent to which 
the molecules can bind to one another. Hybridization reactions can be performed 
under different conditions of "stringency." The "stringency" of a hybridization 
reaction as used herein refers to the conditions (e.g., solvent and temperature 
conditions) under which two nucleic acid strands will either pair or fail to pair to form 
a "hybrid" helix. 
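
As a rough illustration of the "degree of complementarity" between two single strands, the following sketch scores Watson-Crick pairing of two equal-length DNA strands in anti-parallel orientation. The function names and the simple position-by-position score are assumptions made for illustration; the sketch does not model the solvent, temperature, or stringency conditions discussed above.

```python
# Watson-Crick complements for DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def complementarity(strand_a, strand_b):
    """Fraction of positions at which strand_a pairs with strand_b when the
    two equal-length strands are laid out anti-parallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only compares strands of equal length")
    if not strand_a:
        return 0.0
    partner = reverse_complement(strand_b).upper()
    matches = sum(a == b for a, b in zip(strand_a.upper(), partner))
    return matches / len(strand_a)
```

A score of 1.0 corresponds to fully complementary strands; lower scores indicate mismatches that would be expected to destabilize the hybrid under more stringent conditions.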

[0332] "Tm" is the temperature in degrees Celsius at which 50% of a 
polynucleotide duplex made of complementary strands of nucleic acids that are 
hydrogen bonded in an anti-parallel direction by Watson-Crick base pairing dissociates 
into single strands under conditions of the hybridization reaction. Tm can be predicted 
according to a standard formula, such as: Tm = 81.5 + 16.6 log[X+] + 0.41 (%G/C) - 
0.61 (%F) - 600/L, where [X+] is the cation concentration (usually sodium ion, Na+) in 
mol/L; (%G/C) is the number of G and C residues as a percentage of total residues in 
the duplex; (%F) is the percent formamide in solution (wt/vol); and L is the number of 
nucleotides in each strand of the paired nucleic acids. 
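
The formula above can be evaluated directly. The sketch below is one possible transcription; it assumes that "log" denotes the base-10 logarithm, as is conventional for this formula, and the function and parameter names are illustrative only.

```python
import math


def melting_temperature(cation_molar, gc_percent, formamide_percent, length):
    """Predicted Tm in degrees Celsius.

    cation_molar      -- monovalent cation concentration in mol/L
    gc_percent        -- G and C residues as a percentage of total residues
    formamide_percent -- percent formamide in solution (wt/vol)
    length            -- number of nucleotides in each strand of the duplex
    """
    if cation_molar <= 0 or length <= 0:
        raise ValueError("cation concentration and length must be positive")
    return (
        81.5
        + 16.6 * math.log10(cation_molar)  # assumption: log is base 10
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 600.0 / length
    )
```

As a worked case, a 100-nucleotide duplex with 50% G/C content in 1 mol/L sodium and no formamide gives 81.5 + 0 + 20.5 - 0 - 6 = 96 degrees Celsius; lowering the salt to 0.1 mol/L reduces the prediction by 16.6 degrees.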

[0333] A "buffer" is a system that tends to resist change in pH when a given 
increment of hydrogen ion or hydroxide ion is added. Buffered solutions contain 
conjugate acid-base pairs. Any conventional buffer can be used with the inventions 
herein including but not limited to, for example, Tris, phosphate, imidazole, and 
bicarbonate. 

[0334] A "library" of polynucleotides comprises a collection of sequence 
information of a plurality of polynucleotide sequences, which information is provided 
in either biochemical form (e.g., as a collection of polynucleotide molecules), or in 
electronic form (e.g., as a collection of polynucleotide sequences stored in a 
computer-readable form, as in a computer-based system, a computer data file, and/or 
as part of a computer program). 

[0335] A "library" of polypeptides comprises a collection of sequence 
information of a plurality of polypeptide sequences, which information is provided in, 
e.g., a collection of polypeptide sequences stored in a computer-readable form, as in a 
computer-based system, a computer data file, and/or as part of a computer program. 
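
By way of illustration only, a library held in electronic form could be read from a FASTA-formatted text file into memory. The FASTA format is an assumption here, as the disclosure does not prescribe a file format, and the function name and record identifiers are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the record name.
            name = header[0] if header else "record_%d" % (len(records) + 1)
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}
```

The resulting dictionary can hold polynucleotide or polypeptide sequences alike and can then be passed to whatever search means is used to compare a target sequence with the stored sequence information.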

[0336] "Media" refers to a manufacture, other than an isolated nucleic acid 
molecule, that contains the sequence information of the present invention. Such a 
manufacture provides the genome sequence or a subset thereof in a form that can be 
examined by means not directly applicable to the sequence as it exists in a nucleic 
acid, e.g., with computer-readable media comprising data storage structures. Such 
media include, but are not limited to: magnetic storage media, such as a floppy disc, a 
hard disc storage medium, and a magnetic tape; optical storage media such as CD- 
ROM; electrical storage media such as RAM and ROM; and hybrids of these 
categories such as magnetic/optical storage media. 

[0337] "Recorded" refers to a process for storing information on computer 
readable media, using any such methods as known in the art. 

[0338] As used herein, "a computer-based system" refers to the hardware 
means, software means, and data storage means used to analyze the nucleotide 
sequence information of the present invention. The minimum hardware of the 
computer-based systems of the present invention comprises a central processing unit 
(CPU), input means, output means, and data storage means. A skilled artisan can 
readily appreciate that any one of the currently available computer-based systems is 
suitable for use in the present invention. The data storage means can comprise any 
manufacture comprising a recording of the present sequence information as described 
above, or a memory access means that can access such a manufacture. 

[0339] "Search means" refers to one or more programs implemented on the 
computer-based system, to compare a target sequence or target structural motif, or 
expression levels of a polynucleotide in a sample, with the stored sequence 
information. A variety of algorithms are publicly known and commercially 
available, e.g., MacPattern (EMBL), BLAST, BLASTN and BLASTX (NCBI), 
gapped BLAST, BLAZE, the Wise package, FASTX, ClustalW, FASTA, FASTA3, 
Align0, TCoffee, BestFit, FastDB, and TeraBLAST (TimeLogic, Crystal Bay, 
Nevada). Search means can be used to identify fragments or regions of the genome 
that match a particular target sequence or target motif, for example, based on 
sequence similarity, for example, to identify open reading frames (ORFs) within the 
genome that contain homology to ORFs from other organisms. 
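
A very simple search means for locating candidate ORFs can be sketched as follows. This is not one of the programs named above; it assumes an ATG start codon, the three standard stop codons, and scanning of the three forward reading frames only, and the function name and `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) index pairs of forward-strand ORFs.

    Each ORF runs from an ATG through the first in-frame stop codon, so that
    seq[start:end] includes the stop. min_codons counts coding codons,
    excluding the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

ORFs found this way could then be translated and compared with ORFs from other organisms using any of the homology search programs listed above.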

[0340] "Sequence similarity," "sequence homology," "homology," "sequence 
identity," and "percent sequence identity," used interchangeably herein, describe the 
degree of relatedness between two polynucleotide or polypeptide sequences. In 
general, "identity" means the exact match-up of two or more nucleotide sequences or 
two or more amino acid sequences, where the nucleotide or amino acids being 
compared are the same. Also, in general, "similarity" or "homology" means the exact 
match-up of two or more nucleotide sequences or two or more amino acid sequences, 
where the nucleotide or amino acids being compared are either the same or possess 
similar chemical and/or physical properties. The terms also refer to the percentage of 
the "aligned" bases (for the polynucleotides) or amino acid residues (for the 
polypeptides) that are identical when the sequences are aligned. Sequences can be 
aligned in a number of different ways and sequence similarity can be determined in a 
number of different ways. For example, the bases or amino acid residues of one 
sequence can be aligned to a gap in the other sequence, or they can be aligned only to 
another base or amino acid residue in the other sequence. A gap can range anywhere 
from one nucleotide, base, or amino acid residue to multiple exons in length, up to 
any number of nucleotides or amino acid residues. Further, sequences can be aligned 
such that nucleotides (or bases) align with nucleotides, nucleotides align with amino 
acid residues, or amino acid residues align with amino acid residues. 
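
One way to compute percent identity for two sequences that have already been aligned is sketched below. It assumes that gaps are written as "-" and that positions aligned to a gap are excluded from the count of aligned positions; other conventions exist, and the function name is illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) positions at which the two sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # a residue aligned to a gap is not counted
        aligned += 1
        if x == y:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned
```

For the alignment ACGT-A over ACCTTA, five positions are aligned residue-to-residue and four of them match, giving 80% identity under this convention.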

[0341] A "target sequence" can be any polynucleotide or amino acid 
sequence of six or more contiguous nucleotides or two or more amino acids, for 
example, from about 5 or from about 10 to about 100 amino acids, or from about 15 
or from about 30 to about 300 nucleotides. A variety of comparing means can be 
used to accomplish comparison of sequence information from a sample (e.g., to 
analyze target sequences, target motifs, or relative expression levels) with the data 
storage means. A skilled artisan can readily recognize that any one of the publicly 
available homology search programs can be used as the search means for the 
computer based systems of the present invention to accomplish comparison of target 
sequences and motifs. Computer programs to analyze expression levels in a sample 
and in controls are also known in the art. A "target sequence" includes an "antibody 
target sequence," which refers to an amino acid sequence that can be used as an 
immunogen for injection into animals for production of antibodies or for screening 
against a phage display or antibody library for identification of binding partners. 

[0342] A "target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen 
based on a three-dimensional configuration that is formed upon the folding of the 
target motif, or on consensus sequences of regulatory or active sites. There are a 
variety of target motifs known in the art. Protein target motifs include, but are not 
limited to, enzyme active sites and signal sequences. Nucleic acid target motifs 
include, but are not limited to, hairpin structures, promoter sequences, and other 
expression elements such as binding sites for transcription factors. 
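
A consensus-based nucleic acid target motif can be located with an ordinary pattern search. In the sketch below, the TATA-box-like pattern TATA[AT]A[AT] is used purely as an illustrative consensus, and the function name is hypothetical; neither is taken from the disclosure.

```python
import re

# Illustrative consensus only: TATA, then A or T, then A, then A or T.
TATA_LIKE = "TATA[AT]A[AT]"


def find_motif(sequence, pattern):
    """Return (position, matched text) for each non-overlapping match of pattern."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence.upper())]
```

The same function accepts any regular expression, so a consensus for a transcription factor binding site or a protein signal sequence could be substituted for the pattern shown.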

[0343] A "matrix" is a geometric network of antibody molecules and their 
antigens, as found in immunoprecipitation and flocculation reactions. An antibody 
matrix can exist in solution or on a solid phase support. 

[0344] The term "binds specifically," in the context of antibody binding, 
refers to high avidity and/or high affinity binding of an antibody to a specific 
polypeptide, or more accurately, to an epitope of a specific polypeptide. Antibody 
binding to such epitope on a polypeptide can be stronger than binding of the same 
antibody to any other epitopes, particularly other epitopes that can be present in 
molecules in association with, or in the same sample as the polypeptide of interest. 
For example, when an antibody binds more strongly to one epitope than to another, 
adjusting the binding conditions can result in antibody binding almost exclusively to 
the specific epitope and not to any other epitopes on the same polypeptide, and not to 
any other polypeptide, which does not comprise the epitope. Antibodies that bind 
specifically to a subject polypeptide may be capable of binding other polypeptides at a 
weak, yet detectable, level (e.g., 10% or less of the binding shown to the polypeptide 
of interest). Such weak binding, or background binding, is readily discernible from 
the specific antibody binding to a subject polypeptide, e.g., by use of appropriate 
controls. In general, antibodies of the invention bind to a specific polypeptide with a 
binding affinity of 10⁻⁷ M or greater (e.g., 10⁻⁸ M, 10⁻⁹ M, 10⁻¹⁰ M, 10⁻¹¹ M, etc.). 

[0345] The term "host cell" includes an individual cell, cell line, cell culture, 
or in vivo cell, which can be or has been a recipient of any polynucleotides or 
polypeptides of the invention, for example, a recombinant vector, an isolated 
polynucleotide, antibody or fusion protein. Host cells include progeny of a single 
host cell, and the progeny may not necessarily be completely identical (in 
morphology, physiology, or in total DNA, RNA, or polypeptide complement) to the 
original parent cell due to natural, accidental, or deliberate mutation and/or change. 
Host cells can be prokaryotic or eukaryotic, including mammalian, insect, amphibian, 
reptile, crustacean, avian, fish, plant and fungal cells. A host cell includes cells 
transformed, transfected, transduced, or infected in vivo or in vitro with a 
polynucleotide of the invention, for example, a recombinant vector. A host cell which 
comprises a recombinant vector of the invention may be called a "recombinant host 
cell." 

[0346] "Biological sample," "patient sample," "clinical sample," "sample," or 
"biological specimen," used interchangeably herein, encompasses a variety of sample 
types obtained from an individual, including biological fluids such as blood, serum, 
plasma, urine, cerebrospinal fluid, tears, saliva, lymph, dialysis fluid, lavage fluid, 
semen, and other liquid samples or tissues of biological origin. It includes tissue 
samples and tissue cultures or cells derived therefrom and the progeny thereof, 
including cells in culture, cell supernatants, and cell lysates. It includes organ or 
tissue culture derived fluids, tissue biopsy samples, tumor biopsy samples, stool 
samples, and fluids extracted from physiological tissues. Cells dissociated from solid 
tissues, tissue sections, and cell lysates are included. The definition also includes 
samples that have been manipulated in any way after their procurement, such as by 
treatment with reagents, solubilization, or enrichment for certain components, such as 
polynucleotides or polypeptides. Also included in the term are derivatives and 
fractions of biological samples. A biological sample can be used in a diagnostic, 
monitoring, or screening assay. 

[0347] The terms "individual," "host," "patient," and "subject," used 
interchangeably herein, refer to a mammal, including, but not limited to, murines, 
simians, humans, felines, canines, equines, bovines, porcines, ovines, caprines, 
mammalian farm animals, mammalian sport animals, and mammalian pets. 
"Mammals" or "mammalian," are used broadly to describe organisms which are 
within the class mammalia, including the orders carnivore (e.g., dogs and cats), 
rodentia (e.g., mice, guinea pigs, and rats), and other mammals, including cattle, 
goats/sheep, cows, horses, rabbits, and pigs, and primates (e.g., humans, 
chimpanzees, and monkeys). 

[0348] The terms "agent," "substance," "modulator," and "compound" are 
used interchangeably herein. These terms refer to a substance that binds to or 
modulates a level or activity of a subject polypeptide or a level of mRNA encoding a 
subject protein or nucleic acid, or that modulates the activity of a cell containing the 
subject protein or nucleic acid. Where the agent modulates a level of mRNA 
encoding a subject protein, agents include ribozymes, antisense, and RNAi molecules. 
Where the agent is a substance that modulates a level of activity of a subject 
polypeptide, agents include antibodies specific for the subject polypeptide, peptide 
aptamers, small molecules, agents that bind a ligand-binding site in a subject 
polypeptide, and the like. Antibody agents include antibodies that specifically bind a 
subject polypeptide and activate the polypeptide, such as receptor-ligand binding that 
initiates signal transduction; antibodies that specifically bind a subject polypeptide 
and inhibit binding of another molecule to the polypeptide, thus preventing activation 
of a signal transduction pathway; antibodies that bind a subject polypeptide to 
modulate transcription; antibodies that bind a subject polypeptide to modulate 
translation; as well as antibodies that bind a subject polypeptide on the surface of a 
cell to initiate antibody-dependent cytotoxicity ("ADCC") or to initiate cell killing or 
cell growth. Small molecule agents include those that bind the polypeptide to 
modulate activity of the polypeptide or cell containing the polypeptide in a similar 
fashion. The term "agent" also refers to substances that modulate a condition or 
disorder associated with a subject polynucleotide or polypeptide. Such agents include 
subject polynucleotides themselves, subject polypeptides themselves, and the like. 
Agents may be chosen from amongst candidate agents, as defined below. 

[0349] The terms "candidate agent," "subject agent," or "test agent," used 
interchangeably herein, encompass numerous chemical classes, typically synthetic, 
semi-synthetic, or naturally occurring inorganic or organic molecules, small 
molecules, or macromolecular complexes. Candidate agents can be small organic 
compounds having a molecular weight of more than about 50 and less than about 
2,500 daltons. Candidate agents can comprise functional groups necessary for 
structural interaction with proteins, particularly hydrogen bonding, and can include at 
least an amine, carbonyl, hydroxyl or carboxyl group, and can contain at least two of 
the functional chemical groups. The candidate agents can comprise cyclical carbon or 
heterocyclic structures and/or aromatic or polyaromatic structures substituted with 
one or more of the above functional groups. Candidate agents are also found among 
biomolecules, including oligonucleotides, polynucleotides, and fragments thereof, 
depsipeptides, polypeptides and fragments thereof, oligosaccharides, polysaccharides 
and fragments thereof, lipids, fatty acids, steroids, purines, pyrimidines, derivatives 
thereof, structural analogs, modified nucleic acids, modified, derivatized or designer 
amino acids, or combinations thereof. 

[0350] An "agent which modulates a biological activity of a subject 
polypeptide," as used herein, describes any substance, synthetic, semi-synthetic, or 
natural, organic or inorganic, small molecule or macromolecular, pharmaceutical or 
protein, with the capability of altering a biological activity of a subject polypeptide or 
of a fragment thereof, as described herein. Generally, a plurality of assay mixtures is 
run in parallel with different agent concentrations to obtain a differential response to 
the various concentrations. Typically, one of these concentrations serves as a 
negative control, i.e., at zero concentration or below the level of detection. The 
biological activity can be measured using any assay known in the art. 

[0351] An agent which modulates a biological activity of a subject 
polypeptide increases or decreases the activity at least about 10%, at least about 15%, 
at least about 20%, at least about 25%, at least about 50%, at least about 100%, or at 
least about 2-fold, at least about 5-fold, or at least about 10-fold or more when 
compared to a suitable control. 

[0352] The term "agonist" refers to a substance that mimics the function of 
an active molecule. Agonists include, but are not limited to, drugs, hormones, 
antibodies, and neurotransmitters, as well as analogues and fragments thereof. 

[0353] The term "antagonist" refers to a molecule that competes for the 
binding sites of an agonist, but does not induce an active response. Antagonists 
include, but are not limited to, drugs, hormones, antibodies, and neurotransmitters, as 
well as analogues and fragments thereof. 

[0354] The term "receptor" refers to a polypeptide that binds to a specific 
extracellular molecule and may initiate a cellular response. 

[0355] The term "ligand" refers to any molecule that binds to a specific site 
on another molecule. 

[0356] The term "modulate" encompasses an increase or a decrease, a 
stimulation, inhibition, or blockage in the measured activity when compared to a 
suitable control. "Modulation" of expression levels includes increasing the level and 
decreasing the level of an mRNA or polypeptide encoded by a polynucleotide of the 
invention when compared to a control lacking the agent being tested. In some 
embodiments, agents of particular interest are those which inhibit a biological activity 
of a subject polypeptide, and/or which reduce a level of a subject polypeptide in a 
cell, and/or which reduce a level of a subject mRNA in a cell and/or which reduce the 
release of a subject polypeptide from a eukaryotic cell. In other embodiments, agents 
of interest are those that increase a biological activity of a subject polypeptide, and/or 
which increase a level of a subject polypeptide in a cell, and/or which increase a level 
of a subject mRNA in a cell and/or which increase the release of a subject polypeptide 
from a eukaryotic cell. 

[0357] An agent that "modulates the level of expression of a nucleic acid" in 
a cell is one that brings about an increase or decrease of at least about 1.25-fold, at 
least about 1.5-fold, at least about 2-fold, at least about 5-fold, at least about 10-fold, 
or more in the level (i.e., an amount) of mRNA and/or polypeptide following cell 
contact with a candidate agent compared to a control lacking the agent. 
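
By way of non-limiting illustration, the fold-change comparison described above can be sketched in Python. The function names, the symmetric treatment of increases and decreases, and the default threshold of 1.25-fold are assumptions made for this sketch only; they are not a prescribed assay or analysis method.

```python
def fold_change(treated: float, control: float) -> float:
    """Return the magnitude of change between two levels as a fold value >= 1.

    A two-fold increase (200 vs. 100) and a two-fold decrease (50 vs. 100)
    both return 2.0, so one threshold covers increases and decreases.
    """
    if treated <= 0 or control <= 0:
        raise ValueError("expression levels must be positive")
    ratio = treated / control
    return ratio if ratio >= 1.0 else 1.0 / ratio


def modulates_expression(treated: float, control: float,
                         threshold: float = 1.25) -> bool:
    """True if the level after agent contact differs from the control
    (lacking the agent) by at least `threshold`-fold in either direction."""
    return fold_change(treated, control) >= threshold


if __name__ == "__main__":
    # Hypothetical mRNA levels in arbitrary units.
    print(modulates_expression(treated=150.0, control=100.0))
    print(modulates_expression(treated=110.0, control=100.0))
```

The same comparison can be run at each of the recited thresholds (1.25-fold, 1.5-fold, 2-fold, 5-fold, 10-fold) by passing a different `threshold` value.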

[0358] "Modulating a level of active subject polypeptide" includes 
increasing or decreasing activity of a subject polypeptide; increasing or decreasing a 
level of active polypeptide protein; increasing or decreasing a level of mRNA 
encoding active subject polypeptide, and increasing or decreasing the release of 
subject polypeptide from a eukaryotic cell. In some embodiments, an agent is a subject 
polypeptide, where the subject polypeptide itself is administered to an individual. In 
some embodiments, an agent is an antibody specific for a subject polypeptide. In 
some embodiments, an agent is a chemical compound such as a small molecule that 
may be useful as an orally available drug. Such modulation includes the recruitment 
of other molecules that directly effect the modulation. For example, an antibody that 
modulates the activity of a subject polypeptide that is a receptor on a cell surface may 
bind to the receptor and fix complement, activating the complement cascade and 
resulting in lysis of the cell. 

[0359] The term "over-expressed" refers to a state wherein there exists any 
measurable increase over normal or baseline levels. For example, a molecule that is 
over-expressed in a disorder is one that is manifest in a measurably higher level 
compared to levels in the absence of the disorder. 

[0360] "Treatment," "treating," and the like, as used herein, refer to 
obtaining a desired pharmacologic and/or physiologic effect, covering any treatment 
of a pathological condition or disorder in a mammal, including a human. The effect 
may be prophylactic in terms of completely or partially preventing a disorder or 
symptom thereof and/or may be therapeutic in terms of a partial or complete cure for 
a disorder and/or adverse effect attributable to the disorder. That is, "treatment" 
includes (1) preventing the disorder from occurring or recurring in a subject who may 
be predisposed to the disorder but has not yet been diagnosed as having it, (2) 
inhibiting the disorder, such as arresting its development, (3) stopping or terminating 
the disorder or at least symptoms associated therewith, so that the host no longer 
suffers from the disorder or its symptoms, such as causing regression of the disorder 
or its symptoms, for example, by restoring or repairing a lost, missing or defective 
function, or stimulating an inefficient process, or (4) relieving, alleviating, or 
ameliorating the disorder, or symptoms associated therewith, where ameliorating is 
used in a broad sense to refer to at least a reduction in the magnitude of a parameter, 
such as inflammation, pain, and/or tumor size. 

[0361] A "pharmaceutically acceptable carrier," "pharmaceutically 
acceptable diluent," or "pharmaceutically acceptable excipient," or "pharmaceutically 
acceptable vehicle," used interchangeably herein, refer to a non-toxic solid, semisolid 
or liquid filler, diluent, encapsulating material or formulation auxiliary of any 
conventional type. A pharmaceutically acceptable carrier is non-toxic to recipients at 
the dosages and concentrations employed and is compatible with other ingredients of 
the formulation. For example, the carrier for a formulation containing polypeptides 
would not normally include oxidizing agents and other compounds that are known to 
be deleterious to polypeptides. Suitable carriers include, but are not limited to, water, 
dextrose, glycerol, saline, ethanol, and combinations thereof. The carrier can contain 
additional agents such as wetting or emulsifying agents, pH buffering agents, or 
adjuvants which enhance the effectiveness of the formulation. Adjuvants of the 
invention include, but are not limited to, Freund's, Montanide ISA Adjuvants (Seppic, 
Paris, France), Ribi's Adjuvants (Ribi ImmunoChem Research, Inc., Hamilton, MT), 
Hunter's TiterMax (CytRx Corp., Norcross, GA), Aluminum Salt Adjuvants 
(Alhydrogel - Superfos of Denmark/Accurate Chemical and Scientific Co., Westbury, 
NY), Nitrocellulose-Adsorbed Protein, Encapsulated Antigens, and Gerbu Adjuvant 
(Gerbu Biotechnik GmbH, Gaiberg, Germany/C-C Biotech, Poway, CA). Topical 
carriers include liquid petroleum, isopropyl palmitate, polyethylene glycol, ethanol 
(95%), polyoxyethylene monolaurate (5%) in water, or sodium lauryl sulfate (5%) in 
water. Other materials such as anti-oxidants, humectants, viscosity stabilizers, and 
similar agents can be added as necessary. Percutaneous penetration enhancers such as 
Azone can also be included. 

[0362] "Pharmaceutically acceptable salts" include the acid addition salts 
(formed with the free amino groups of the polypeptide) and which are formed with 
inorganic acids such as, for example, hydrochloric or phosphoric acids, or such 
organic acids as acetic, mandelic, oxalic, and tartaric. Salts formed with the free 
carboxyl groups can also be derived from inorganic bases such as, for example, 
sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases 
as isopropylamine, trimethylamine, 2-ethylamino ethanol, and histidine. 

[0363] Compositions for oral administration can form solutions, 
suspensions, tablets, pills, capsules, sustained release formulations, oral rinses, or 
powders. 

[0364] The term "unit dosage form," as used herein, refers to physically 
discrete units suitable as unitary dosages for human and animal subjects, each unit 
containing a predetermined quantity of compounds of the present invention calculated 
in an "effective amount," that is, a dosage sufficient to produce the desired result or 
effect in association with a pharmaceutically acceptable carrier. The specifications for 
the novel unit dosage forms of the present invention depend on the particular 
compound employed, the host, and the effect to be achieved, as well as the 
pharmacodynamics associated with each compound in the host. 

Compositions 

[0365] The present invention provides novel isolated polynucleotides 
encoding polypeptides and fragments thereof. The present invention also provides 
novel isolated polypeptides, fragments thereof, and compositions comprising same. 
The present invention further provides polynucleotide compositions that can be used 
to identify the polypeptides. 

[0366] The present invention provides recombinant vectors and host cells for 
use in gene expression, primer pairs for use in hybridizations, computer-based 
embodiments for use in bioinformatics, and transgenic animals and embryonic stem 
cell lines for use in mutating and regulating gene expression. 

Nucleic Acids 

Sequences 

[0367] This invention provides genes encoding proteins, the encoded 
proteins, and fragments and homologs thereof. It provides human polynucleotide 
sequences and the corresponding mouse polynucleotide sequences. 

[0368] The nucleic acids of the subject invention can encode all or a part of 
the subject proteins. Double or single stranded fragments can be obtained from the 
DNA sequence by chemically synthesizing oligonucleotides in accordance with 
conventional methods, by restriction enzyme digestion, or by polymerase chain 
reaction (PCR) amplification. The use of the polymerase chain reaction has 
been described (Saiki et al., 1985) and current techniques have been reviewed 
(Sambrook et al., 1989; McPherson et al., 2000; Dieffenbach and Dveksler, 1995). 
For the most part, DNA fragments will be of at least about 5 nucleotides, at least 
about 8 nucleotides, at least about 10 nucleotides, at least about 15 nucleotides, at 
least about 18 nucleotides, at least about 20 nucleotides, at least about 25 nucleotides, 
at least about 30 nucleotides, or at least about 50 nucleotides, at least about 75 
nucleotides, or at least about 100 nucleotides. Nucleic acid compositions that encode 
at least six contiguous amino acids (i.e., fragments of 18 nucleotides or more), for 
example, nucleic acid compositions encoding at least 8 contiguous amino acids (i.e., 
fragments of 24 nucleotides or more), are useful in directing the expression or the 
synthesis of peptides that can be used as immunogens (Lerner, 1982; Shinnick et al., 
1983; Sutcliffe et al., 1983). 
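
The relationship between peptide length and fragment length recited above (six amino acids corresponding to 18 nucleotides, eight amino acids to 24 nucleotides) follows from the triplet codon. A minimal Python sketch of that arithmetic is given below; the function names are hypothetical and are used for illustration only.

```python
CODON_LENGTH = 3  # nucleotides per codon in the triplet genetic code


def min_coding_nucleotides(n_amino_acids: int) -> int:
    """Smallest in-frame fragment length (in nucleotides) that can encode
    the given number of contiguous amino acids."""
    if n_amino_acids < 0:
        raise ValueError("number of amino acids cannot be negative")
    return CODON_LENGTH * n_amino_acids


def max_encoded_amino_acids(n_nucleotides: int) -> int:
    """Largest number of complete codons that fit in a fragment of the
    given length; trailing partial codons are ignored."""
    if n_nucleotides < 0:
        raise ValueError("fragment length cannot be negative")
    return n_nucleotides // CODON_LENGTH


if __name__ == "__main__":
    for residues in (6, 8):
        print(residues, "amino acids ->", min_coding_nucleotides(residues), "nucleotides")
```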

[0369] In some embodiments, a polynucleotide of the invention comprises a 
nucleotide sequence of at least about 5, at least about 8, at least about 10, at least 
about 15, at least about 18, at least about 20, at least about 25, at least about 30, at 
least about 50, at least about 75, at least about 100, at least about 150, at least about 
200, at least about 250, at least about 300, at least about 350, at least about 400, at 
least about 450, at least about 500, at least about 550, at least about 600, at least about 
650, at least about 700, at least about 750, at least about 800, at least about 850, at 
least about 900, at least about 950, at least about 1000, at least about 1100, at least 
about 1200, at least about 1300, at least about 1400, at least about 1500, at least about 
1600, at least about 1700, at least about 1800, at least about 1900, at least about 2000, 
at least about 2100, at least about 2200, at least about 2300, at least about 2400, at 
least about 2500, at least about 3000, at least about 4000, or at least about 5000 
contiguous nucleotides of any one of the sequences shown in SEQ ID NOS.: 1-104, or 
the coding region thereof, or a complement thereof. 
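
As a non-limiting sketch, a test for whether a query polynucleotide comprises at least a given number of contiguous nucleotides of a reference sequence, or of its complement, can be written in Python as follows. The function names and the example sequences are hypothetical; the sequences are not taken from the Sequence Listing, and the complement is read here as the reverse complement, which is an assumption of this sketch.

```python
# Translation table for complementing DNA bases (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(query: str, reference: str, min_len: int) -> bool:
    """True if some window of `min_len` contiguous nucleotides of `query`
    occurs in `reference` or in the reverse complement of `reference`."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    query = query.upper()
    reference = reference.upper()
    targets = (reference, reverse_complement(reference))
    # Slide a window of min_len across the query and look it up in both strands.
    for start in range(len(query) - min_len + 1):
        window = query[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False


if __name__ == "__main__":
    reference = "ATGGCCATTGTAATGGGCCGCTGA"   # hypothetical reference
    query = "TTTTGCCATTGTAATTTT"             # hypothetical query
    print(shares_contiguous_run(query, reference, min_len=10))
```

A plain substring scan is adequate for short sequences; for large sequence sets an indexed search would normally be substituted.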

[0370] In other embodiments, a polynucleotide of the invention has at least 
about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 
about 90%, at least about 95%, at least about 97%, at least about 98%, or at least 
about 99% nucleotide sequence identity with a nucleotide sequence, or a fragment 
thereof, of the coding region of any one of the sequences shown in SEQ ID NOS.: 1-104, 
or a complement thereof. These sequence variants include naturally-occurring 
variants (e.g., SNPs, allelic variants, and homologs from other species), degenerate 
variants, variants associated with disease or pathological states, and variants resulting 
from random or directed mutagenesis, as well as from chemical or other modification. 
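
For illustration only, percent nucleotide sequence identity between two sequences that have already been aligned can be computed as in the Python sketch below. The convention used here (identical, non-gap columns divided by the total number of alignment columns) is one of several conventions in use and is an assumption of this sketch; the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. A column counts as identical only when both
    sequences carry the same residue and that residue is not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 10-column alignment with one mismatch.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

The alignment itself would be produced by a separate alignment program; this sketch only scores an existing alignment.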

[0371] In some embodiments, a polynucleotide of the invention comprises a 
nucleotide sequence that encodes a polypeptide comprising an amino acid sequence of 
at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, 
at least about 20, at least about 25, at least about 30, at least about 50, at least about 
75, at least about 100, at least about 150, at least about 200, at least about 250, at least 
about 300, at least about 350, at least about 400, at least about 450, at least about 500, 
at least about 550, at least about 600, at least about 650, at least about 700, at least 
about 750, at least about 800, at least about 850, at least about 900, at least about 950, 
or at least about 1000 contiguous amino acids of at least one of the sequences encoded 
by SEQ ID NOS.: 1-104. 

[0372] In some embodiments, the present invention includes a 
polynucleotide selected from SEQ ID NOS.: 1-104, which contain 300 bp of the 5' 
terminus of a protein-encoding polynucleotide sequence. Such a polynucleotide is 
useful for the purposes of clustering gene sequences to determine gene family. 

[0373] In further embodiments, a polynucleotide of the invention hybridizes 
under stringent hybridization conditions to a polynucleotide having the coding region 
of any one of the sequences shown in SEQ ID NOS.: 1-104, or a complement 
thereof. 

[0374] The polynucleotides of the invention include those that encode 
variants of the polypeptide sequences encoded by the polynucleotides of the Sequence 
Listing. In some embodiments, these polynucleotides encode variant polypeptides 
that include insertions, additions, deletions, or substitutions compared with the 
polypeptides encoded by the nucleotide sequences shown in SEQ ID NOS.: 1-104, 
and in Table 1. Conservative amino acid substitutions include serine/threonine, 
valine/leucine/isoleucine, asparagine/histidine/glutamine, glutamic acid/aspartic acid, 
etc. (Gonnet et al., 1992). 

[0375] The nucleic acids of the invention include degenerate variants that 
can be translated, according to the standard genetic code, to provide an amino acid 
sequence identical to that translated from the nucleic acid sequences herein. For 
example, synonymous codons include GGG, GGA, GGC, and GGU, each encoding 
Glycine. 
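
The degeneracy described above can be illustrated with a short Python sketch that builds the standard genetic code and lists the codons for a given amino acid. The names `CODON_TABLE`, `translate`, and `synonymous_codons` are hypothetical, and the example sequences are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate an in-frame RNA or DNA sequence into one-letter amino
    acid codes ('*' marks a stop codon)."""
    rna = sequence.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def synonymous_codons(amino_acid: str) -> list:
    """Return, in sorted order, every codon encoding the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    print(synonymous_codons("G"))
    # Two hypothetical degenerate variants that encode the same peptide.
    print(translate("AUGGGAUAA") == translate("ATGGGTTAA"))
```

Two nucleic acids that differ only at synonymous positions translate to the same amino acid sequence, which is the sense in which they are degenerate variants of one another.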

[0376] The nucleic acids of the invention include single nucleotide 
polymorphisms (SNPs), which occur frequently in eukaryotic genomes (Lander et al., 
2001). The nucleotide sequence determined from one individual of a species can 
differ from other allelic forms present within the population. 

[0377] The nucleic acids of the invention include homologs of the 
polynucleotides. The source of homologous genes can be any species, e.g., primate 
species, particularly human; rodents, such as rats, hamsters, guinea pigs, and mice; 
rabbits, canines, felines; cattle, such as bovines, goats, pigs, sheep, equines, 
crustaceans, birds, chickens, reptiles, amphibians, fish, insects, plants, fungi, yeast, 
nematodes, etc. Among mammalian species, e.g., human and mouse, homologs have 
substantial sequence similarity, e.g., at least about 60% sequence identity, at least 
about 75% sequence identity, or at least about 80% sequence identity among 
nucleotide sequences. In many embodiments of interest, homology will be at least 
about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 
95%, at least about 97%, or at least about 98%, where in certain embodiments of 
interest homology will be as high as about 99%. 

[0378] Modifications in the native structure of nucleic acids, including 
alterations in the backbone, sugars or heterocyclic bases, have been shown to increase 
intracellular stability and binding affinity. Among useful changes in the backbone 
chemistry are phosphorothioates; phosphorodithioates, where both of the 
non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl 
phosphotriesters and boranophosphates. Achiral phosphate derivatives include 
3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate, 3'-CH₂-5'-O-phosphonate 
and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids replace the entire ribose 
phosphodiester backbone with a peptide linkage. 

[0379] Sugar modifications are also used to enhance stability and affinity. 
The α-anomer of deoxyribose can be used, where the base is inverted with respect to 
the natural β-anomer. The 2'-OH of the ribose sugar can be altered to form 2'-O-methyl 
or 2'-O-allyl sugars, which provides resistance to degradation without 
compromising affinity. 

[0380] Modification of the heterocyclic bases must maintain proper base 
pairing. Some useful substitutions include deoxyuridine for deoxythymidine; 
5-methyl-2'-deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine. 
5-Propynyl-2'-deoxyuridine and 5-propynyl-2'-deoxycytidine have been shown to 
increase affinity and biological activity when substituted for deoxythymidine and 
deoxycytidine, respectively. 

[0381] A genomic sequence of interest comprises the nucleic acid present 
between the initiation codon and the stop codon, as defined in the listed sequences, 
including all of the introns that are normally present in a native chromosome. It can 
further include the 3' and 5' untranslated regions found in the mature mRNA. It can 
further include specific transcriptional and translational regulatory sequences, such as 
promoters, enhancers, etc., including about 1 kb, about 2 kb, and possibly more, of 
flanking genomic DNA at either the 5' or 3' end of the transcribed region. The 
genomic DNA can be isolated as a fragment of 100 kbp or smaller; and substantially 
free of flanking chromosomal sequence. The genomic DNA flanking the coding 
region, either 3' or 5', or internal regulatory sequences as sometimes found in introns, 
contains sequences required for proper tissue and stage specific expression. 

[0382] Nucleic acid molecules of the invention can comprise heterologous 
nucleic acid molecules, i.e., nucleic acid molecules other than the subject nucleic acid 
molecules, of any length. For example, the subject nucleic acid molecules can be 
flanked on the 5' and/or 3' ends by heterologous nucleic acid molecules of from about 
1 nucleotide to about 10 nucleotides, from about 10 nucleotides to about 20 
nucleotides, from about 20 nucleotides to about 50 nucleotides, from about 50 
nucleotides to about 100 nucleotides, from about 100 nucleotides to about 250 
nucleotides, from about 250 nucleotides to about 500 nucleotides, or from about 500 
nucleotides to about 1000 nucleotides, or more in length. 

[0383] The subject polynucleotides include those that encode fusion proteins 
comprising the subject polypeptides fused to "fusion partners." For example, the 
present soluble receptor or ligand can be fused to an immunoglobulin fragment, such 
as an Fc fragment for stability in circulation or to fix complement. Other polypeptide 
fragments that have equivalent capabilities as the Fc fragments can also be used 
herein. 

[0384] The isolated nucleic acids of the invention can be used as probes to 
detect and characterize gross alteration in a genomic locus, such as deletions, 
insertions, translocations, and duplications, e.g., applying fluorescence in situ 
hybridization (FISH) techniques to examine chromosome spreads (Andreeff et al., 
1999). The nucleic acids are also useful for detecting smaller genomic alterations, 
such as deletions, insertions, additions, translocations, and substitutions (e.g., SNPs). 

[0385] When used as probes to detect nucleic acid molecules capable of 
hybridizing with nucleic acids described in the Sequence Listing, the nucleic acid 
molecules can be flanked by heterologous sequences of any length. When used as 
probes, a subject nucleic acid can include nucleotide analogs that incorporate labels 
that are directly detectable, such as radiolabels or fluorophores, or nucleotide analogs 
that incorporate labels that can be visualized in a subsequent reaction, such as biotin 
or various haptens. Haptens that are commonly conjugated to nucleotides for 
subsequent labeling include biotin, digoxigenin, and dinitrophenyl. 

[0386] Suitable fluorescent labels include fluorochromes, e.g., fluorescein 
and its derivatives, e.g., fluorescein isothiocyanate (FITC), 6-carboxyfluorescein 
(6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), 
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM); coumarin 
and its derivatives, e.g., 7-amino-4-methylcoumarin, aminocoumarin; bodipy dyes, 
such as Bodipy FL; cascade blue; Oregon green; rhodamine dyes, e.g., rhodamine, 
6-carboxy-X-rhodamine (ROX), Texas red, phycoerythrin, and tetramethylrhodamine; 
eosins and erythrosins; cyanine dyes, e.g., allophycocyanin, Cy3 and Cy5 or 
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); macrocyclic chelates of 
lanthanide ions, e.g., quantum dye, etc; and chemiluminescent molecules, e.g., 
luciferases. 

[0387] Fluorescent labels also include a green fluorescent protein (GFP), i.e., 
a "humanized" version of a GFP, e.g., wherein codons of the naturally-occurring 
nucleotide sequence are changed to more closely match human codon bias; a GFP 
derived from Aequorea victoria or a derivative thereof, e.g., a "humanized" derivative 
such as Enhanced GFP, which are available commercially, e.g., from Clontech, Inc.; 
other fluorescent mutants of a GFP from Aequorea victoria, e.g., as described in U.S. 
Patent Nos. 6,066,476; 6,020,192; 5,985,577; 5,976,796; 5,968,750; 5,968,738; 
5,958,713; 5,919,445; 5,874,304; a GFP from another species such as Renilla 
reniformis, Renilla mulleri, or Ptilosarcus guernyi, as previously described (WO 
99/49019; Peelle et al., 2001), "humanized" recombinant GFP (hrGFP) (Stratagene®); 
any of a variety of fluorescent and colored proteins from Anthozoan species (e.g., 
Matz et al., 1999). 

[0388] Probes can also contain fluorescent analogs, including commercially 
available fluorescent nucleotide analogs that can readily be incorporated into a subject 
nucleic acid. These include deoxyribonucleotides and/or ribonucleotide analogs 
labeled with Cy3, Cy5, Texas Red, Alexa Fluor dyes, rhodamine, cascade blue, or 
BODIPY, and the like. 

[0389] Suitable radioactive labels include, e.g., ³²P, ³⁵S, or ³H. For 
example, probes can contain radiolabeled analogs, including those commonly labeled 
with ³²P or ³⁵S, such as α-³²P-dATP, -dTTP, -dCTP, and -dGTP; γ-³⁵S-GTP and α-³⁵S-dATP, 
and the like. 

[0390] Nucleic acids of the invention can also be bound to a substrate. 
Subject nucleic acids can be attached covalently, attached to a surface of the support 
or applied to a derivatized surface in a chaotropic agent that facilitates denaturation 
and adherence, e.g., by noncovalent interactions, or some combination thereof. The 
nucleic acids can be bound to a substrate to which a plurality of other nucleic acids 
are concurrently bound, hybridization to each of the plurality of the bound nucleic
acids being separately detectable. 

[0391] The substrate can be porous or solid, planar or non-planar, unitary or 
distributed; and the bond between the nucleic acid and the substrate can be covalent or 
non-covalent. The substrate can be in the form of microbeads or nanobeads. 
Substrates include, but are not limited to, a membrane, such as nitrocellulose, nylon, 
positively-charged derivatized nylon; a solid substrate such as glass, amorphous 
silicon, crystalline silicon, plastics (including e.g., polymethylacrylic, polyethylene,
polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride,
polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, cellulose 
acetate, or mixtures thereof). 

[0392] The subject nucleic acids include antisense RNA, ribozymes, and 
RNAi. Further, the nucleic acids of the invention can be used for antisense or RNAi
inhibition of transcription or translation using methods known in the art (Phillips, 
1999a; Phillips, 1999b; Hartmann et al., 1999; Stein et al., 1998; Agrawal et al., 
1998).
Expression Vectors

[0393] The instant invention further provides host cells, e.g., recombinant 
host cells, that comprise a subject nucleic acid, host cells that comprise a recombinant 
vector, and host cells that secrete antibodies of the invention. Subject host cells can 
be cultured in vitro, or can be part of a multicellular organism. Host cells are
described in more detail below. The instant invention further provides transgenic 
plants and non-human animals, as described in more detail below. 

[0394] In addition to the plurality of uses described in greater detail in 
following sections, the subject nucleic acids find use in the preparation of all or a 
portion of the polypeptides of the subject invention, as described above, using an 
expression system. For expression, an expression vector can be employed. The 
expression vector will provide a transcriptional and translational initiation region, 
which may be inducible, conditionally-active, constitutive, or tissue-specific, where
the coding region is operably linked under the transcriptional control of the 
transcriptional initiation region, and a transcriptional and translational termination 
region. These control regions can be native to a gene encoding the subject peptides, 
or can be derived from heterologous or exogenous sources. 

[0395] The subject nucleic acids can also be provided as part of a vector 
(e.g., a polynucleotide construct comprising an expression cassette), a wide variety of 
which are known in the art. Vectors include, but are not limited to, plasmids; 
cosmids; viral vectors; human, yeast, bacterial, P1-derived artificial chromosomes
(HAC's, YAC's, BAC's, PAC's, etc.), mini-chromosomes, and the like. Vectors are 
amply described in numerous publications well known to those in the art (Ausubel, et 
al.; Jones et al., 1998a; Jones et al., 1998b). Vectors can provide for nucleic acid 
expression, for nucleic acid propagation, or both. 

[0396] A recombinant vector or construct that includes a nucleic acid of the 
invention is useful for propagating a nucleic acid in a host cell; such vectors are 
known as "cloning vectors." Vectors can transfer nucleic acid between host cells 
derived from disparate organisms; these are known in the art as "shuttle vectors." 
Vectors can also insert a subject nucleic acid into a host cell's chromosome; these are 
known in the art as "insertion vectors." Vectors can express either sense or antisense 
RNA transcripts of the invention in vitro (e.g., in a cell-free system or within an in 
vitro cultured host cell) or in vivo (e.g., in a multicellular plant or animal); these are
known in the art as "expression vectors," which can be part of an expression system.
Expression vectors can also produce a subject antibody. 

[0397] Vectors typically include at least one origin of replication, at least 
one site for insertion of heterologous nucleic acid (e.g., in the form of a polylinker 
with multiple, tightly clustered, single cutting restriction endonuclease recognition 
sites), and at least one selectable marker, although some integrative vectors will lack 
an origin that is functional in the host to be chromosomally modified, and some 
vectors will lack selectable markers. Vectors can be transiently or stably maintained
in the cells, usually for a period of at least about one day, at least about several days to
at least about several weeks.

[0398] Prior to vector insertion, the DNA of interest will be obtained 
substantially free of other nucleic acid sequences. The DNA can be "recombinant," 
and flanked by one or more nucleotides with which it is not normally associated on a 
naturally occurring chromosome. 

[0399] Expression vectors generally have convenient restriction sites located 
near the promoter sequence to provide for the insertion of nucleic acid sequences 
encoding heterologous protein or RNA molecules. A selectable marker operative in 
the expression system or host can be present. Expression vectors can be used for the 
production of fusion proteins, where the fusion peptide provides additional 
functionality, i.e., increased protein synthesis, a leader sequence for secretion, 
stability, reactivity with defined antisera, or an enzyme marker, e.g., β-galactosidase.

[0400] Promoters of the invention can be naturally contiguous or not
naturally contiguous to the expressed nucleic acid molecule. Promoters can be
inducible, conditionally-active (such as the cre-lox promoter), constitutive,
and/or tissue-specific.

[0401] Expression vectors can be prepared comprising a transcription 
cassette comprising a transcription initiation region, the gene or fragment thereof, and 
a transcriptional termination region. Of particular interest is the use of DNA 
sequences that allow for the expression of functional epitopes or domains, at least 
about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least 
about 20, at least about 25, at least about 30, at least about 50, at least about 75, at 
least about 100, at least about 150, at least about 200, at least about 250, at least about 
300, at least about 350, at least about 400, at least about 450, at least about 500, at
least about 550, at least about 600, at least about 650, at least about 700, at least about
750, at least about 800, at least about 850, at least about 900, at least about 950, or at 
least about 1000 amino acids in length, or any of the above-described fragments, up to 
and including the complete open reading frame of the gene. After introduction of 
these DNA sequences, the cells containing the vector construct can be selected by 
means of a selectable marker, and the selected cells expanded and used as expression- 
competent host cells. 

[0402] Host cells can comprise prokaryotes or eukaryotes that express 
proteins and polypeptides in accordance with conventional methods, the method 
depending on the purpose for expression. For large scale production of the protein, a 
unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, insect cells in
combination with baculovirus vectors, or cells of a higher organism such as 
vertebrates, particularly mammals, e.g., COS 7 cells, can be used as the expression 
host cells. In some situations, it is desirable to express eukaryotic genes in eukaryotic 
cells, where the encoded protein will benefit from native folding and post- 
translational modifications. 

[0403] Specific expression systems of interest include plants, bacteria, yeast, 
insect cells, and mammalian cell-derived expression systems. Representative systems
from each of these categories are provided below.

[0404] Expression systems in plants include those described in U.S. Patent 
No. 6,096,546 and U.S. Patent No. 6,127,145. 

[0405] Expression systems in bacteria include those described by Chang et
al., 1978; Goeddel et al., 1979; Goeddel et al., 1980; EP 0 036,776; U.S. Patent No.
4,551,433; DeBoer et al., 1983; and Siebenlist et al., 1980.

[0406] Expression systems in yeast include those described by Hinnen et al.,
1978; Ito et al., 1983; Kurtz et al., 1986; Kunze et al., 1985; Gleeson et al., 1986;
Roggenkamp et al., 1986; Das et al., 1984; De Louvencourt et al., 1983; Van den
Berg et al., 1990; Kunze et al., 1985; Cregg et al., 1985; U.S. Patent Nos. 4,837,148
and 4,929,555; Beach and Nurse, 1981; Davidow et al., 1985; Gaillardin et al., 1985;
Ballance et al., 1983; Tilburn et al., 1983; Yelton et al., 1984; Kelly and Hynes, 1985;
EP 0 244,234; WO 91/00357; and U.S. Patent No. 6,080,559.

[0407] Expression systems for heterologous genes in insects include those
described in U.S. Patent No. 4,745,051; Friesen et al., 1986; EP 0 127,839; EP 0
155,476; Vlak et al., 1988; Miller et al., 1988; Carbonell et al., 1988; Maeda et al.,
1985; Lebacq-Verheyden et al., 1988; Smith et al., 1985; Miyajima et al., 1987; and
Martin et al., 1988. Numerous baculoviral strains and variants and corresponding 
permissive insect host cells are described in Luckow et al., 1988, Miller et al., 1986, 
and Maeda et al., 1985. The insect cell expression system is useful not only for 
production of heterologous proteins intracellularly, but can be used for expression of 
transmembrane proteins on the insect cell surfaces. Such insect cells can be used as
an immunogen for production of antibodies, for example, by injection of the insect
cells into mice, rabbits, or other suitable animals.

[0408] Mammalian expression systems include those described in Dijkema 
et al., 1985; Gorman et al., 1982; Boshart et al., 1985; and U.S. Patent No. 4,399,216. 
Additional features of mammalian expression are facilitated as described in Ham and 
Wallace, 1979; Barnes and Sato, 1980; U.S. Patent Nos. 4,767,704, 4,657,866,
4,927,762, 4,560,655, WO 90/103430, WO 87/00195, and U.S. RE 30,985. 
Mammalian cell expression systems can also be used for production of antibodies. 

[0409] The present polynucleotides can also be used in cell-free expression 
systems such as a bacterial system, e.g., E. coli lysate, a rabbit reticulocyte lysate system,
a wheat germ extract system, a frog oocyte lysate system, and the like, which are
conventional in the art. See, for example, WO 00/68412, WO 01/27260, WO 
02/24939, WO 02/38790, WO 91/02076, and WO 91/02075. 

[0410] When any of the above-referenced host cells, or other appropriate 
host cells or organisms, are used to replicate and/or express the polynucleotides of the 
invention, the resulting replicated nucleic acid, RNA, expressed protein or 
polypeptide, is within the scope of the invention as a product of the host cell or 
organism. 

[0411] Once the gene corresponding to a selected polynucleotide is
identified, its expression can be regulated in the gene's native cell types. For example, 
an endogenous gene of a cell can be regulated by an exogenous regulatory sequence 
inserted into the genome of the cell at a location that will enhance or reduce 
expression of the gene corresponding to the subject polypeptide. The regulatory
sequence can be designed to integrate into the genome via homologous 
recombination, as disclosed in U.S. Patent Nos. 5,641,670 and 5,733,761, the 
disclosures of which are herein incorporated by reference. Alternatively, it can be 
designed to integrate into the genome via non-homologous recombination, as 
described in WO 99/15650, the disclosure of which is also herein incorporated by
reference. Also encompassed in the subject invention is the production of proteins
without manipulating the encoding nucleic acid itself, but rather by integrating a 
regulatory sequence into the genome of a cell that already includes a gene that 
encodes the protein of interest; this production method is described in the above- 
incorporated patent documents. 
Isolated Primer Pairs 

[0412] In some embodiments, the invention provides isolated nucleic acids 
that, when used as primers in a polymerase chain reaction, amplify a subject 
polynucleotide, or a polynucleotide containing a subject polynucleotide. The 
amplified polynucleotide is from about 20 to about 50, from about 50 to about 75, 
from about 75 to about 100, from about 100 to about 125, from about 125 to about 
150, from about 150 to about 175, from about 175 to about 200, from about 200 to 
about 250, from about 250 to about 300, from about 300 to about 350, from about 350 
to about 400, from about 400 to about 500, from about 500 to about 600, from about 
600 to about 700, from about 700 to about 800, from about 800 to about 900, from 
about 900 to about 1000, from about 1000 to about 2000, from about 2000 to about 
3000, from about 3000 to about 4000, from about 4000 to about 5000, or from about 
5000 to about 6000 nucleotides or more in length. 

[0413] The isolated nucleic acids themselves are from about 10 to about 20, 
from about 20 to about 30, from about 30 to about 40, from about 40 to about 50, 
from about 50 to about 100, or from about 100 to about 200 nucleotides in length. 
Generally, the nucleic acids are used in pairs in a polymerase chain reaction, where 
they are referred to as "forward" and "reverse" primers. 

[0414] Thus, in some embodiments, the invention provides a pair of isolated 
nucleic acid molecules, each from about 10 to about 200 nucleotides in length, the 
first nucleic acid molecule of the pair comprising a sequence of at least 10 contiguous 
nucleotides having 100% sequence identity to a nucleic acid sequence as shown in 
SEQ ID NOS.: 1 - 104 and the second nucleic acid molecule of the pair comprising a 
sequence of at least 10 contiguous nucleotides having 100% sequence identity to the 
reverse complement of the nucleic acid sequence shown in SEQ ID NOS.: 1-104, 
wherein the sequence of the second nucleic acid molecule is located 3' of the nucleic 
acid sequence of the first nucleic acid molecule shown in SEQ ID NOS.: 1-104. The 
primer nucleic acids are prepared using any known method, e.g., automated synthesis,
and can be chosen to specifically amplify a cDNA copy of an mRNA encoding a
subject polypeptide. 
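
As a non-limiting illustration of the primer-pair geometry described above, the following Python sketch selects a forward primer identical to the start of a target region and a reverse primer identical to the reverse complement of its end. The template sequence, the function names, and the default primer length of 10 are hypothetical and chosen only for illustration; the template is not a sequence from the Sequence Listing.

```python
# Illustrative sketch only. The template is a made-up sequence, not one of
# SEQ ID NOS.: 1-104, and the helper names are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, start: int, end: int, primer_len: int = 10):
    """Pick a forward/reverse primer pair flanking template[start:end].

    The forward primer matches the first primer_len bases of the region.
    The reverse primer is the reverse complement of the last primer_len
    bases, so its binding site lies 3' of the forward primer site.
    """
    if primer_len < 10:
        raise ValueError("primers here are at least 10 nucleotides long")
    if start < 0 or end > len(template):
        raise ValueError("region lies outside the template")
    if end - start < 2 * primer_len:
        raise ValueError("region too short for two non-overlapping primers")
    region = template[start:end]
    forward = region[:primer_len]
    reverse = reverse_complement(region[-primer_len:])
    return forward, reverse, len(region)


template = "ATGGCCAAGCTTGGATCCGAATTCCTGCAGGTCGACTCTAGAGGTACCTAA"
fwd, rev, amplicon_len = primer_pair(template, 0, 30)
print("forward primer:", fwd)
print("reverse primer:", rev)
print("expected amplicon length:", amplicon_len)
```

A practical primer design would also weigh melting temperature, GC content, and specificity against other sequences, none of which this sketch attempts.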

[0415] In some embodiments, the first and/or the second nucleic acid
molecules comprise a detectable label. The label can be a radioactive molecule, 
fluorescent molecule or another molecule, e.g., hapten, as described in detail above. 
Further, the label can be a two stage system, where the amplified DNA is conjugated 
to another molecule, i.e., biotin, digoxin, or a hapten, that has a high affinity binding 
partner, i.e., avidin, antidigoxin, or a specific antibody, respectively, and the binding 
partner conjugated to a detectable label. The label can be conjugated to one or both of 
the primers. Alternatively, the pool of nucleotides used in the amplification is
labeled, so as to incorporate the label into the amplification product. 

[0416] Conditions that increase stringency of both DNA/DNA and
DNA/RNA hybridization reactions are widely known and published in the art. See, 
for example, Sambrook, 1989, and examples provided above. Examples of relevant 
conditions include (in order of increasing stringency): incubation temperatures of 
25°C, 37°C, 50°C, and 68°C; buffer concentrations of 10 x SSC, 6 x SSC, 1 x SSC, 0.1 
x SSC (where 1 x SSC is 0.15 M NaCl and 15 mM citrate buffer); and their 
equivalents using other buffer systems; formamide concentrations of 0%, 25%, 50%, 
and 75%; incubation times from 5 minutes to 24 hours; 1, 2, or more washing steps; 
wash incubation times of 1, 2, or 15 minutes; and wash solutions of 6 x SSC, 1 x SSC, 
0.1 x SSC, or deionized water. 

[0417] For example, "high stringency conditions" include hybridization in
50% formamide, 5X SSC, 0.2 µg/µl poly(dA), 0.2 µg/µl human cot1 DNA, and 0.5%
SDS, in a humid oven at 42°C overnight, followed by successive washes in 1X SSC,
0.2% SDS at 55°C for 5 minutes, followed by washing at 0.1X SSC, 0.2% SDS at
55°C for 20 minutes. Further examples of high stringency conditions include
hybridization at 50°C and 0.1x SSC (15 mM sodium chloride/1.5 mM sodium citrate);
overnight incubation at 42°C in a solution containing 50% formamide, 1 x SSC (150
mM NaCl, 15 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5 x
Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon
sperm DNA, followed by washing the filters in 0.1 x SSC at about 65°C. High
stringency conditions also include aqueous hybridization (e.g., free of formamide) in 
6X SSC (where 20X SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% sodium
dodecyl sulfate (SDS) at 65°C for about 8 hours (or more), followed by one or more
washes in 0.2 X SSC, 0.1% SDS at 65°C. Highly stringent hybridization conditions 
are hybridization conditions that are at least as stringent as any one of the above 
representative conditions. Other stringent hybridization conditions are known in the 
art and can also be employed to identify nucleic acids of this particular embodiment 
of the invention. 

[0418] Conditions of "reduced stringency," suitable for hybridization to
molecules encoding structurally and functionally related proteins, or otherwise 
serving related or associated functions, are the same as those for high stringency 
conditions but with a reduction in temperature for hybridization and washing to lower 
temperatures (e.g., room temperature or about 22°C to 25°C). For example, moderate 
stringency conditions include aqueous hybridization (e.g., free of formamide) in 6X 
SSC, 1% SDS at 65°C for about 8 hours (or more), followed by one or more washes in 
2X SSC, 0.1% SDS at room temperature. Low stringency conditions include, for 
example, aqueous hybridization at 50°C and 6xSSC (0.9 M sodium chloride/0.09 M 
sodium citrate) and washing at 25°C in 1xSSC (0.15 M sodium chloride/0.015 M
sodium citrate). 
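
The SSC concentrations recited above all follow by simple scaling from the 1 x SSC definition (0.15 M NaCl and 15 mM sodium citrate). The following Python sketch, provided only as an illustration, reproduces that arithmetic; the function and constant names are hypothetical.

```python
# Illustrative arithmetic only: scale the 1x SSC composition stated in the
# text (0.15 M NaCl, 0.015 M sodium citrate) to other fold-concentrations.

NACL_M_PER_X = 0.15       # mol/L NaCl in 1x SSC
CITRATE_M_PER_X = 0.015   # mol/L sodium citrate in 1x SSC


def ssc_molarity(fold: float):
    """Return (NaCl molarity, sodium citrate molarity) for fold-x SSC."""
    if fold < 0:
        raise ValueError("fold concentration cannot be negative")
    return fold * NACL_M_PER_X, fold * CITRATE_M_PER_X


for fold in (20, 6, 1, 0.1):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold}x SSC: {nacl:.3f} M NaCl, {citrate:.4f} M sodium citrate")
```

Lower SSC fold-concentrations correspond to lower salt and therefore to higher stringency, consistent with the ordering of buffer concentrations given above.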

[0419] The specificity of a hybridization reaction allows any single-stranded
sequence of nucleotides to be labeled with a radioisotope or chemical and used as a 
probe to find a complementary strand, even in a cell or cell extract that contains 
millions of different DNA and RNA sequences. Probes of this type are widely used to 
detect the nucleic acids corresponding to specific genes, both to facilitate the 
purification and characterization of the genes after cell lysis and to localize them in 
cells, tissues, and organisms. 

[0420] Moreover, by carrying out hybridization reactions under conditions of 
"reduced stringency," a probe prepared from one gene can be used to find 
homologous evolutionary relatives - both in the same organism, where the relatives 
form part of a gene family, and in other organisms, where the evolutionary history of 
the nucleotide sequence can be traced. A person skilled in the art would recognize 
how to modify the conditions to achieve the requisite degree of stringency for a 
particular hybridization.
Libraries

[0421] The polynucleotide libraries of the invention generally comprise a 
collection of sequence information of a plurality of polynucleotide sequences, where 
at least one of the polynucleotides has a sequence shown in SEQ ID NOS.: 1 - 104. 
By plurality is meant at least 2, at least 3, or up to and including all of the sequences in the
Sequence Listing. The information may be provided in either biochemical form (e.g., 
as a collection of polynucleotide molecules), or in electronic form (e.g., as a 
collection of polynucleotide sequences stored in a computer-readable form, as in a 
computer-based system, a computer data file, and/or as a part of a computer program). 
The length and number of polynucleotides in the library will vary with the nature of 
the library, e.g., if the library is an oligonucleotide array, a cDNA array, or a 
computer database of the sequence information. 

[0422] The sequence information contained in either a biochemical or an 
electronic library of polynucleotides can be used in a variety of ways, e.g., as a 
resource for gene discovery, as a representation of sequences expressed in a selected 
cell type (e.g., cell type markers), or as markers of a given disorder or disease state. 
In general, a disease marker is a representation of a gene product that is present in all 
cells affected by disease either at an increased or decreased level relative to a normal 
cell (e.g., a cell of the same or similar type that is not substantially affected by 
disease). For example, a polynucleotide sequence in a library can be a polynucleotide 
that represents an mRNA, polypeptide, or other gene product encoded by the 
polynucleotide, that is either over-expressed or under-expressed in one cell compared 
to another (e.g., a first cell type compared to a second cell type; a normal cell 
compared to a diseased cell; a cell not exposed to a signal or stimulus compared to a 
cell exposed to that signal or stimulus; and the like). 
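
By way of a hedged illustration, the comparison of expression between two cells can be sketched in Python as a simple ratio test. The two-fold threshold, the function name, and the example values are assumptions made for this sketch and are not taken from the specification.

```python
# Illustrative sketch only. The default two-fold threshold is an assumption
# for this example, not a value stated in the specification.

def classify_expression(test_level: float, reference_level: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify a gene product as over-, under-, or similarly expressed
    in a test cell relative to a reference (e.g., normal) cell."""
    if test_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test_level / reference_level
    if ratio >= fold_threshold:
        return "over-expressed"
    if ratio <= 1.0 / fold_threshold:
        return "under-expressed"
    return "unchanged"


print(classify_expression(40.0, 10.0))  # diseased cell vs. normal cell
print(classify_expression(5.0, 10.0))
print(classify_expression(12.0, 10.0))
```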

[0423] The nucleotide sequence information of the library can be embodied 
in any suitable form, e.g., electronic or biochemical forms. For example, a library of 
sequence information embodied in electronic form comprises an accessible computer 
data file that may contain the representative nucleotide sequences of genes that are 
differentially expressed (e.g., over-expressed or under-expressed) as between, e.g., a 
first cell type compared to a second cell type (e.g., expression in a brain cell compared 
to expression in a kidney cell); a normal cell compared to a diseased cell (e.g., a non- 
cancerous cell compared to a cancerous cell); a cell not exposed to an internal or 
external signal or stimulus compared to a cell exposed to that signal or stimulus (e.g.,
a cell contacted with a ligand compared to a control cell not contacted with the
ligand); and the like. Other combinations and comparisons of cells will be readily 
apparent to the ordinarily skilled artisan. Biochemical embodiments of the library 
include a collection of nucleic acid molecules that have the sequences of the genes in 
the library, where the nucleic acids can correspond to the entire gene in the library or 
to a fragment thereof, as described in greater detail below. 

[0424] Where the library is an electronic library, the nucleic acid sequence 
information can be present in a variety of media. For example, the nucleic acid 
sequences of any of the polynucleotides shown in SEQ ID NOS.: 1 - 104 can be 
recorded on computer readable media of a computer-based system, e.g., any medium 
that can be read and accessed directly by a computer. One of skill in the art can 
readily appreciate how any of the presently known computer readable mediums can 
be used to create a manufacture comprising a recording of the present sequence 
information. Any convenient data storage structure can be chosen, based on the 
means used to access the stored information. A variety of data processor programs 
and formats can be used for storage, e.g., word processing text file, database format, 
etc. In addition to the sequence information, electronic versions of the libraries of the 
invention can be provided in conjunction or connection with other computer-readable 
information and/or other types of computer-based files (e.g., searchable files, 
executable files, etc., including, but not limited to, for example, search program
software, etc.). 
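
As one possible illustration of an electronic library, the following Python sketch stores sequence records in FASTA-style plain text and reads them back. FASTA is used here only as an example of a widely used text format; the record names and sequences are hypothetical placeholders rather than entries from the Sequence Listing.

```python
# Illustrative sketch only: record names and sequences are placeholders.
import io


def write_fasta(records, handle, width=60):
    """Write {name: sequence} records as FASTA text, wrapping lines."""
    for name, seq in records.items():
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


def read_fasta(handle):
    """Parse FASTA text back into a {name: sequence} dictionary."""
    records, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


library = {
    "example_record_1": "ATGGCCAAGCTTGGATCCGAATTC" * 4,
    "example_record_2": "GTCGACTCTAGAGGTACC",
}
buffer = io.StringIO()          # stands in for a computer data file
write_fasta(library, buffer)
buffer.seek(0)
restored = read_fasta(buffer)
print(sorted(restored))
```

An in-memory buffer is used so the sketch runs without touching the file system; an ordinary file handle could be substituted.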

[0425] By providing the nucleotide sequence in computer readable form in a 
computer-based system, the information can be accessed for a variety of purposes. 
Computer software to access sequence information is publicly available. 
Conventional bioinformatics tools can be utilized to analyze sequences to determine 
sequence identity, sequence similarity, and gap information. For example, the gapped 
BLAST (Altschul et al., 1990, Altschul et al., 1997), and BLAZE (Brutlag et al., 
1993) search algorithms on a Sybase system, or the TeraBLAST (TimeLogic, Crystal 
Bay, Nevada) program optionally running on a specialized computer platform 
available from TimeLogic, can be used to identify open reading frames (ORFs) within 
the genome that contain homology to ORFs from other organisms. Homology 
between sequences of interest can be determined using the local homology algorithm 
of Smith and Waterman, 1981, as well as the BestFit program (Rechid et al., 1989), 
and the FastDB algorithm (FastDB, 1988; described in Current Methods in Sequence
Comparison and Analysis, Macromolecule Sequencing and Synthesis, Selected
Methods and Applications, pp. 127-149, 1988, Alan R. Liss, Inc). 
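
For illustration, a minimal Python sketch of local alignment scoring in the manner of Smith and Waterman is given below. It is a simplification: it uses a linear gap penalty and returns only the best score, and the match, mismatch, and gap values are arbitrary assumptions rather than parameters of any program named above.

```python
# Illustrative sketch only: a simplified Smith-Waterman local alignment
# score with a linear gap penalty. Scoring values are assumptions.

def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Only two rows of the matrix are kept at a time, which is sufficient when the score alone is needed; recovering the aligned region itself would require retaining traceback information.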

[0426] Alignment programs that permit gaps in the sequence include 
ClustalW (Thompson et al., 1994), FASTA3 (Pearson, 2000), Align0 (Myers and
Miller, 1988), and TCoffee (Notredame et al., 2000). Other methods for comparing
and aligning nucleotide and protein sequences include, for example, BLASTX
(NCBI), the Wise package (Birney and Durbin, 2000), and FASTX (Pearson, 2000).
These algorithms determine sequence homology between nucleotide and protein
sequences without translating the nucleotide sequences into protein sequences. Other
techniques for alignment are also known in the art (Doolittle, et al., 1996; BLAST, 
available from the National Center for Biotechnology Information; FASTA, available 
in the Genetics Computing Group (GCG) package, from Madison, Wisconsin, USA, a 
wholly owned subsidiary of Oxford Molecular Group, Inc.; Schlessinger, 1988a; 
Schlessinger, 1988b; and Needleman and Wunsch, 1970).

[0427] Sequence similarity is calculated based on a reference sequence, 
which may be a subset of a larger sequence, such as a conserved motif, coding region, 
flanking region, etc. The reference sequence is usually at least about 18 nt long, at
least about 30 nt long, or may extend to the complete sequence that is being 
compared. 

[0428] One parameter for determining percent sequence identity is the 
percentage of the alignment in the region of strongest alignment between a target and 
a query sequence. Methods for determining this percentage involve, for example, 
counting the number of aligned bases of a query sequence in the region of strongest 
alignment and dividing this number by the total number of bases in the region. For 
example, 10 matches divided by 11 total residues gives a percent sequence identity of
approximately 90.9%. The length of the aligned region is typically at least about 
55%, at least about 58%, or at least about 60% of the total sequence length, and can 
be as great as about 62%, as great as about 64%, and even as great as about 66% of 
the total sequence length. 
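
The percentage calculation described above can be illustrated with the following Python sketch, which counts matching positions in an aligned region and divides by the length of that region. The function names and the example sequences are hypothetical; the example is constructed to reproduce the 10-of-11 case given in the text.

```python
# Illustrative sketch only: sequences and function names are hypothetical.

def percent_identity(aligned_query: str, aligned_target: str) -> float:
    """Percent identity over an aligned region of equal-length strings.

    Gap characters ('-') are counted in the region length but never as
    matches.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("aligned region is empty")
    matches = sum(1 for q, t in zip(aligned_query, aligned_target)
                  if q == t and q != "-")
    return 100.0 * matches / len(aligned_query)


def aligned_fraction(region_length: int, total_length: int) -> float:
    """Length of the aligned region as a percentage of the total length."""
    if total_length <= 0:
        raise ValueError("total length must be positive")
    return 100.0 * region_length / total_length


query = "ACGTACGTACG"    # 11 aligned residues
target = "ACGTACGTACC"   # differs from the query at the last position only
print(f"{percent_identity(query, target):.1f}% identity")
print(f"{aligned_fraction(11, 20):.0f}% of the total sequence length")
```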

[0429] The present invention includes human and mouse polynucleotide and 
polypeptide sequences that are at least about 95%, at least about 96%, at least about 
97%, at least about 98%, or at least about 99% homologous to the sequences in the 
Sequence Listing, based on using the method of determining sequence identity with 
the insertion of gaps to detect the maximum degree of sequence identity. In other
embodiments of interest, homology will be at least about 80%, at least about 85%, or
as high as about 90%.

[0430] A variety of structural formats for the input and output means can be 
used to input and output the information in the computer-based systems of the present 
invention. One format for an output means ranks the relative expression levels of 
different polynucleotides. Such presentation provides a skilled artisan with a ranking 
of relative expression levels to determine a gene expression profile. 
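
One such output format can be sketched in Python as a list of polynucleotides sorted from highest to lowest relative expression level. The identifiers and numerical values below are invented for illustration only and do not represent measured data.

```python
# Illustrative sketch only: names and expression values are invented.

def rank_expression(levels):
    """Return (name, level) pairs sorted from highest to lowest level."""
    return sorted(levels.items(), key=lambda item: item[1], reverse=True)


profile = {
    "polynucleotide_A": 12.5,
    "polynucleotide_B": 3.1,
    "polynucleotide_C": 48.0,
}
ranked = rank_expression(profile)
for rank, (name, level) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {level}")
```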

[0431] As discussed above, the library of the invention also encompasses
biochemical libraries of the polynucleotides shown in SEQ ID NOS.: 1 - 104 and 
2463 - 3697, e.g., collections of nucleic acids representing the provided 
polynucleotides. The biochemical libraries can take a variety of forms, e.g., a solution 
of cDNAs, a pattern of probe nucleic acids stably associated with a surface of a solid 
support (i.e., an array) and the like. Of particular interest are nucleic acid arrays in 
which one or more of the polynucleotide sequences shown in SEQ ID NOS.: 1-104 
is represented on the array. A variety of different array formats have been developed 
and are known to those of skill in the art. The arrays of the subject invention find use 
in a variety of applications, including gene expression analysis, drug screening, 
mutation analysis, and the like, as disclosed in the herein-listed exemplary patent 
documents. 

[0432] In addition to the above nucleic acid libraries, analogous libraries of 
polypeptides are also provided, where the polypeptides of the library will represent at 
least a portion of the polypeptides encoded by a gene corresponding to one or more of 
the sequences shown in SEQ ID NOS.: 1 - 104. 

[0433] Further, analogous libraries of antibodies are also provided, where the 
libraries comprise antibodies or fragments thereof that specifically bind to at least a 
portion of at least one of the subject polypeptides. Further, antibody libraries may 
comprise antibodies or fragments thereof that specifically inhibit binding of a subject 
polypeptide to its ligand or substrate, or that specifically inhibit binding of a subject 
polypeptide as a substrate to another molecule. Moreover, corresponding nucleic acid 
libraries are also provided, comprising polynucleotide sequences that encode the 
antibodies or antibody fragments described above.
Polypeptides

Sequences 

[0434] This invention provides novel polypeptides, and related polypeptide 
compositions. The novel polypeptides of the invention encompass proteins encoded 
by the nucleic acids having nucleotide sequences shown in SEQ ID NOS.: 1 - 104. 
The subject polypeptides are human polypeptides, fragments thereof, variants (such as 
splice variants), homologs from other species, and derivatives thereof. In particular 
embodiments, a polypeptide of the invention has an amino acid sequence substantially
identical to the sequence of any polypeptide encoded by a polynucleotide sequence 
shown in SEQ ID NOS.: 1 - 104. 

[0435] These polypeptides may reside within the cell, or extracellularly. 
They may be secreted from the cell, reside in the cytoplasm, in the membranes, or in 
any of the intracellular organelles, including the nucleus, mitochondria, ribosomes, or 
storage granules. 

[0436] In many embodiments, a novel polypeptide of the invention functions 
as a secreted protein, a single-transmembrane protein, a multiple-transmembrane 
protein, a kinase, a protein kinase, a ligase, a nuclear hormone receptor, a 
phosphatase, a protease, a phosphodiesterase, a kinesin, an immunoglobulin, a T-cell 
receptor, or a glycosylphosphatidylinositol anchor. A novel polypeptide of the 
invention can also possess one or more of the following functions or properties: (1) 
an activator functioning to regulate one or more genes by increasing the rate of 
transcription, (2) an activator functioning to positively modulate an allosteric enzyme, 
(3) an adaptor functioning to sort cargo molecules into transport vesicles, (4) an
adaptor functioning to form a clathrin-coated vesicle, (5) an adhesion molecule 
functioning to mediate the adhesion of cells with other cells and/or the extracellular 
matrix, (6) an ATPase functioning to move ions or small molecules across a
membrane against a chemical concentration gradient or electrical potential, (7) an 
ATPase functioning to translocate nucleotides across membranes, (8) a breakpoint- 
related sequence functioning as an oncoprotein, (9) a breakpoint-related sequence 
functioning as a tumor-specific antigen, (10) a channel functioning as a water channel, 
(11) a channel functioning as an ion channel, (12) a checkpoint-related sequence
functioning at DNA damage checkpoints, (13) a checkpoint-related sequence 
functioning at replication checkpoints, (14) a checkpoint-related sequence functioning 
to initiate signal transduction cascades eliciting cell cycle arrest, DNA repair, or
apoptosis, (15) a complex functioning as a protein scaffold, (16) a complex
functioning in ADP-ribosylation, (17) a dehydrogenase functioning to synthesize
amino acids, (18) a disintegrin functioning to inhibit blood clotting, (19) a disintegrin 
functioning as a metallopeptidase, (20) a GTPase functioning as a negative regulator 
of p53, (21) a GTPase functioning to stimulate ras GTPase activity, (22) a helicase 
functioning in DNA replication, (23) a hydrolase functioning in propionate
metabolism, (24) an integrase functioning to integrate a DNA copy of a retroviral 
genome into a host chromosome, (25) an integrin functioning as a tumor marker, (26) 
an integrin functioning in cell migration, (27) an isomerase functioning as an 
immunosuppressant, (28) a membrane protein functioning as a scaffolding component 
at the cytoplasmic face of a lipid raft, (29) a membrane protein functioning as a ligand 
for a receptor tyrosine kinase, (30) oxygenases and peroxidases functioning as 
antioxidants, (31) a phospholipase functioning in eicosanoid synthesis, (32) a 
phospholipase functioning in preserving the intestinal mucosa, (33) a prosaposin 
functioning in lipid catabolism, (34) a proteasome component functioning in muscle 
wasting, (35) a reductase-related sequence functioning as a coenzyme A reductase 
inhibitor, (36) a reverse transcriptase functioning as an RNA-dependent reverse 
transcriptase, (37) a reverse transcriptase functioning as a DNA-dependent reverse 
transcriptase, (38) an RNase functioning in viral assembly, (39) an RNase H 
functioning to form oligonucleotides that prime DNA synthesis, (40) an RNase H 
functioning to cleave the RNA strand of an RNA-DNA hybrid, (41) SH3 domains 
functioning in actin cytoskeletal organization, (42) SH3 domains functioning in signal 
transduction, (43) a synthetase functioning as an autoantigen, (44) synthetases
functioning in nucleotide sugar phosphate synthesis, (45) TATA boxes functioning as 
transcription initiators, (46) tat functioning as a transcriptional coactivator, (47)
transferases functioning in signal transduction, (48) transposases functioning as gene 
transfer agents, (49) ubiquitins functioning to protect cells against tumor necrosis 
factor induced cell death, (50) proteasome components and ubiquitin functioning in 
protein degradation, (51) a virus-related sequence functioning to confer resistance to 
infection by viruses, (52) other sequences of the invention interacting with one or 
more proteins, (53) other sequences of the invention enzymatically modifying one or 
more proteins, (54) other sequences of the invention binding one or more small 
molecule ligands, (55) other sequences of the invention binding one or more peptides,
(56) other sequences of the invention binding one or more carbohydrates, and (57)
other sequences of the invention functioning in vesicular transport. 

[0437] In some embodiments, the present novel polypeptide modulates the 
cells or tissues of animals, particularly humans, such as, for example, by stimulating, 
enhancing or inhibiting T or B cell function or the function of other hematopoietic
cells or bone marrow cells; modulates adult or embryonic stem cell or precursor cell
growth or differentiation; modulates cell function or activity of neuronal cells or other
cells of the CNS, heart cells, liver cells, kidney cells, lung cells, pancreatic cells, 
gastrointestinal cells, spleen cells, breast cells, prostate cells, ovarian cells, and the 
like. 

[0438] In some embodiments, a subject polypeptide is present as a multimer. 
Multimers include homodimers, homotrimers, homotetramers, and multimers that
include more than four monomeric units. Multimers also include heteromultimers, 
e.g., heterodimers, heterotrimers, heterotetramers, etc. where the subject polypeptide 
is present in a complex with proteins other than the subject polypeptide. Where the 
multimer is a heteromultimer, the subject polypeptide can be present in a 1 : 1 ratio, a 
1:2 ratio, a 2:1 ratio, or other ratio, with the other protein(s). 

[0439] In addition to the above specifically listed proteins, polypeptides 
from other species are also provided, including mammals, such as: primates, rodents, 
e.g., mice, rats, hamsters, guinea pigs; domestic animals, e.g., sheep, pig, horse, cow, 
goat, rabbit, dog, cat; and humans, as well as non-mammalian species, e.g., avian, 
reptile and amphibian, insect, crustacean, fish, plant, fungus, and protozoa. 

[0440] By "homolog" is meant a protein having at least about 35%, at least
about 40%, at least about 60%, at least about 70%, at least about 75%, at least about 
80%, at least about 85%, at least about 90%, or at least about 95%, or higher, amino 
acid sequence identity to the reference polypeptide, as measured with the "GAP" 
program (part of the Wisconsin Sequence Analysis Package available through the 
Genetics Computer Group, Inc. (Madison WI)), where the parameters are: Gap 
weight: 12; length weight:4. In many embodiments of interest, homology will be at 
least about 75%, at least about 80%, or at least 85%, where in certain embodiments of 
interest, homology will be as high as about 90%. 

[0441] Also provided are polypeptides that are substantially identical to the 
at least one amino acid sequence shown in the Sequence Listing, or a fragment 
thereof, where by "substantially identical" is meant that the protein has an amino acid
sequence identity to the reference sequence of at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least
about 98%, or at least about 99%. 
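
By way of illustration only, the following minimal Python sketch shows how an identity threshold of this kind might be applied to a pair of sequences that have already been aligned (gaps written as "-"). It does not reproduce the GAP program or its gap weight and length weight parameters. The function names, the choice of the full alignment length as the denominator, and the example sequences are assumptions made for this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity is counted over all alignment columns, and a
    gap ("-") never counts as a match. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 75.0) -> bool:
    """True if identity meets the threshold (75% is the lowest value
    recited for 'substantially identical'; higher values can be passed)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # hypothetical reference segment
    candidate = "MKTAYLAKQR"   # one substitution in ten columns
    print(percent_identity(reference, candidate))
    print(is_substantially_identical(reference, candidate, threshold=85.0))
```

A stricter embodiment can be modeled simply by raising the threshold argument, e.g., to 90.0 or 95.0.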

[0442] The proteins of the subject invention (e.g., polypeptides encoded by 
the nucleotide sequences shown in SEQ ID NOS.: 1 - 104) have been separated from
their naturally occurring environment and are present in a non-naturally occurring 
environment. In certain embodiments, the proteins are present in a composition 
where they are more concentrated than in their naturally occurring environment. For 
example, purified polypeptides are provided. 

[0443] In addition to naturally occurring proteins, polypeptides that vary 
from naturally occurring forms are also provided. Fusion proteins can comprise a 
subject polypeptide, or fragment thereof, and a polypeptide other than a subject 
polypeptide ("the fusion partner") fused in-frame at the N-terminus and/or C-terminus
of the subject polypeptide, or internally to the subject polypeptide. 

[0444] Suitable fusion partners include, but are not limited to, 
immunologically detectable proteins (e.g., epitope tags, such as hemagglutinin, 
FLAG, and c-myc); polypeptides that provide a detectable signal or that serve as 
detectable markers (e.g., a fluorescent protein, e.g., a green fluorescent protein, a 
fluorescent protein from an Anthozoan species; β-galactosidase; luciferase; Cre
recombinase; and the like); polypeptides that provide a catalytic function or induce a 
cellular response; polypeptides that provide for secretion of the fusion protein from a 
eukaryotic cell; polypeptides that provide for secretion of the fusion protein from a 
prokaryotic cell; polypeptides that provide for binding to metal ions (e.g., (His)n, where
n = 3-10, e.g., 6His) and structural proteins. Fusion partners can also be those that are 
able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a 
fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM, 
and/or IgD. 

[0445] Detection methods are chosen based on the detectable fusion partner. 
For example, where the fusion partner provides an immunologically recognizable 
epitope, an epitope-specific antibody can be used to quantitatively detect the level of 
polypeptide. In some embodiments, the fusion partner provides a detectable signal, 
and in these embodiments, the detection method is chosen based on the type of signal 
generated by the fusion partner. For example, where the fusion partner is a 
fluorescent protein, fluorescence is measured.

[0446] Where the fusion partner is an enzyme that yields a detectable
product, the product can be detected using an appropriate means. For example,
β-galactosidase can, depending on the substrate, yield a colored product that can be
detected with a spectrophotometer, and the enzyme luciferase can yield a
luminescent product detectable with a luminometer.

[0447] In some embodiments, a polypeptide of the invention comprises at 
least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at 
least about 20, at least about 25, at least about 30, at least about 50, at least about 75,
at least about 100, at least about 150, at least about 200, at least about 250, at least 
about 300, at least about 350, at least about 400, at least about 450, at least about 500, 
at least about 550, at least about 600, at least about 650, at least about 700, at least 
about 750, at least about 800, at least about 850, at least about 900, at least about 950, 
or at least about 1000 contiguous amino acid residues of at least one of the sequences 
according to SEQ ID NOS.: 1-104, up to and including the entire amino acid 
sequence. 

[0448] Fragments of the subject polypeptides, as well as polypeptides 
comprising such fragments, are also provided. Fragments of polypeptides of interest 
will typically be at least about 5, at least about 8, at least about 10, at least about 15, at 
least about 18, at least about 20, at least about 25, at least about 30, at least about 50, 
at least about 75, at least about 100, at least about 150, at least about 200, at least 
about 250, or at least 300 aa in length or longer, where the fragment will have a 
stretch of amino acids that is identical to the subject protein of at least about 5, at least 
about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least 
about 25, at least about 30, or at least about 50 aa in length. 
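
The two length criteria above (overall fragment length and the length of the stretch identical to the subject protein) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the default cutoffs of 5 residues, and the example sequences are hypothetical, and the identical stretch is computed as the longest common contiguous substring.

```python
def longest_shared_stretch(fragment: str, reference: str) -> int:
    """Length of the longest run of residues that occurs contiguously
    in both sequences (longest common substring, dynamic programming)."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for i in range(1, len(fragment) + 1):
        curr = [0] * (len(reference) + 1)
        for j in range(1, len(reference) + 1):
            if fragment[i - 1] == reference[j - 1]:
                # Extend the run that ended at the previous residue pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def qualifies_as_fragment(fragment: str, reference: str,
                          min_length: int = 5,
                          min_identical_stretch: int = 5) -> bool:
    """Apply both cutoffs: total length and identical contiguous stretch."""
    return (len(fragment) >= min_length and
            longest_shared_stretch(fragment, reference) >= min_identical_stretch)


if __name__ == "__main__":
    subject = "AAMKTAYIAK"      # hypothetical subject protein segment
    candidate = "XXMKTAYZZ"     # shares the stretch MKTAY with the subject
    print(longest_shared_stretch(candidate, subject))
    print(qualifies_as_fragment(candidate, subject))
```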

[0449] In some embodiments, fragments exhibit one or more activities 
associated with a corresponding naturally occurring polypeptide. Fragments find 
utility in generating antibodies to the full-length polypeptide; and in methods of 
screening for candidate agents that bind to and/or modulate polypeptide activity. 
Specific fragments of interest include those with enzymatic activity, those with 
biological activity including the ability to serve as an epitope or immunogen, and 
fragments that bind to other proteins or to nucleic acids. 

[0450] The invention provides polypeptides comprising such fragments, 
including, e.g., fusion polypeptides comprising a subject polypeptide fragment fused 
in frame (directly or indirectly) to another protein (the "fusion partner"), such as the
signal peptide of one protein being fused to the mature polypeptide of another protein.
Such fusion proteins are typically made by linking the encoding polynucleotides 
together in a vector or cassette. Suitable fusion partners include, but are not limited 
to, immunologically detectable proteins (e.g., epitope tags, such as hemagglutinin, 
FLAG, and c-myc); polypeptides that provide a detectable signal or that serve as 
detectable markers (e.g., a fluorescent protein, e.g., a green fluorescent protein, a 
fluorescent protein from an Anthozoan species; β-galactosidase; luciferase; Cre
recombinase); polypeptides that provide a catalytic function or induce a cellular 
response; polypeptides that provide for secretion of the fusion protein from a 
eukaryotic cell; polypeptides that provide for secretion of the fusion protein from a 
prokaryotic cell; polypeptides that provide for binding to metal ions (e.g., (His)n, where
n = 3-10, e.g., 6His) and structural proteins. Fusion partners can also be those that are 
able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a 
fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM,
and/or IgD.

Polypeptide Preparation. 

[0451] Polypeptides of the invention can be obtained from
naturally-occurring sources or produced synthetically. The sources of naturally occurring
polypeptides will generally depend on the species from which the protein is to be 
derived, i.e., the proteins will be derived from biological sources that express the 
proteins. The subject proteins can also be derived from synthetic means, e.g., by 
expressing a recombinant gene encoding a protein of interest in a suitable system or 
host or enhancing endogenous expression, as described in more detail above. Further, 
small peptides can be synthesized in the laboratory by techniques well known in the 
art. 

[0452] In all cases, the product can be recovered by any appropriate means 
known in the art. For example, convenient protein purification procedures can be 
employed (e.g., see Guide to Protein Purification, Deutscher et al., 1990). That is, a
lysate can be prepared from the original source, (e.g., a cell expressing endogenous 
polypeptide, or a cell comprising the expression vector expressing the polypeptide(s)), 
and purified using HPLC, exclusion chromatography, gel electrophoresis, or affinity 
chromatography, and the like. 

[0453] The invention thus also provides methods of producing polypeptides.
Briefly, the methods generally involve introducing a nucleic acid construct into a host
cell in vitro and culturing the host cell under conditions suitable for expression, then
harvesting the polypeptide, either from the culture medium or from the host cell (e.g.,
by disrupting the host cell), or both, as described in detail above. The invention also
provides methods of producing a polypeptide using cell-free in vitro
transcription/translation methods, which are well known in the art, also as provided
above.

[0454] Moreover, the invention provides polypeptides, including polypeptide
fragments, as targets for therapeutic intervention, including use in screening assays
for identifying agents that modulate polypeptide level and/or activity, and as targets
for antibody and small molecule therapeutics, for example, in the treatment of
disorders.

Methods

[0455] The present invention provides methods of producing a subject 
polypeptide and provides antibodies that specifically bind to a subject polypeptide. 
The present invention further provides screening methods for identifying agents that 
modulate a level or an activity of a subject polypeptide or polynucleotide. The 
present invention thus also provides agents that modulate a level or an activity of a 
subject polypeptide or polynucleotide, as well as compositions, including 
pharmaceutical compositions, comprising a subject agent. 

[0456] The present invention further provides methods for treating disorders 
such as, for example, cancer and other proliferative disorders or conditions, 
inflammatory and immune disorders, metabolic disorders or conditions and bacterial 
or viral disorders or conditions. 

Diagnostic and Therapeutic Applications 

Screening and Diagnostic Methods 

1. Identifying Biological Molecules that Interact with a Polypeptide
[0457] Formation of a binding complex between a subject polypeptide and 
an interacting polypeptide or other macromolecule (e.g., DNA, RNA, lipids, 
polysaccharides, and the like) can be detected using any known method. Suitable 
methods include: a yeast two-hybrid system (Zhu et al., 1997; Fields and Song, 1989; 
U.S. Pat. No. 5,283,173; Chien et al. 1991); a mammalian cell two-hybrid method; a 
fluorescence resonance energy transfer (FRET) assay; a bioluminescence resonance 
energy transfer (BRET) assay; a fluorescence quenching assay; a fluorescence
anisotropy assay (Jameson and Sawyer, 1995); an immunological assay; and an assay
involving binding of a detectably labeled protein to an immobilized protein. 

[0458] Immunological assays, and assays involving binding of a detectably 
labeled protein to an immobilized protein can be performed in a variety of ways. For 
example, immunoprecipitation assays can be designed such that the complex of 
protein and an interacting polypeptide is detected by precipitation with an antibody 
specific for either the protein or the interacting polypeptide. 

[0459] FRET detects formation of a binding complex between a subject 
polypeptide and an interacting polypeptide. It involves the transfer of energy from a 
donor fluorophore in an excited state to a nearby acceptor fluorophore. For this 
transfer to take place, the donor and acceptor molecules must be in close proximity 
(e.g., less than 10 nanometers apart, usually between 10 and 100 Å apart), and the
emission spectra of the donor fluorophore must overlap the excitation spectra of the 
acceptor fluorophore. In these embodiments, a fluorescently labeled subject protein 
serves as a donor and/or acceptor in combination with a second fluorescent protein or 
dye. 
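
The steep dependence of energy transfer on donor-acceptor distance can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where R0 is the distance at which transfer is 50% efficient. The following Python sketch is illustrative only; the function name and the example Förster radius of 5 nm are assumptions and are not values taken from this disclosure.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency from the Forster relation
    E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0:
        raise ValueError("distance must be non-negative")
    if forster_radius_nm <= 0:
        raise ValueError("Forster radius must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Forster radius in nm for a donor/acceptor pair
    # Scan distances across the 1-10 nm (10-100 angstrom) range.
    for r in (1.0, 2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, r0):.4f}")
```

Because efficiency falls with the sixth power of distance, a detectable signal under these assumptions indicates that the labeled partners lie within a few nanometers of one another.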

[0460] Fluorescent proteins can be produced by generating a construct 
comprising a protein and a fluorescent fusion partner. These are well-known in the 
art, as described above, including green fluorescent protein (GFP), i.e., a "humanized" 
version of a GFP, e.g., wherein codons of the naturally-occurring nucleotide sequence 
are changed to more closely match human codon bias; a GFP derived from Aequorea
victoria or a derivative thereof, e.g., a "humanized" derivative such as Enhanced GFP,
which are available commercially, e.g., from Clontech, Inc.; other fluorescent mutants
of a GFP from Aequorea victoria, e.g., as described in U.S. Patent Nos. 6,066,476;
6,020,192; 5,985,577; 5,976,796; 5,968,750; 5,968,738; 5,958,713; 5,919,445;
5,874,304; a GFP from another species such as Renilla reniformis, Renilla mulleri, or
Ptilosarcus guernyi, as previously described (WO 99/49019; Peelle et al., 2001),
"humanized" recombinant GFP (hrGFP) (Stratagene®); any of a variety of fluorescent
and colored proteins from Anthozoan species (e.g., Matz et al., 1999); as well as
proteins labeled with other fluorescent dyes, fluorescein and its derivatives, e.g.,
fluorescein isothiocyanate (FITC), 6-carboxyfluorescein (6-FAM),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM),
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE); rhodamine dyes, e.g.,
Texas red, phycoerythrin, tetramethylrhodamine, rhodamine, 6-carboxy-X-rhodamine
(ROX); coumarin and its derivatives, e.g., 7-amino-4-methylcoumarin,
aminocoumarin; bodipy dyes, such as Bodipy FL; cascade blue; Oregon green; eosins
and erythrosins; cyanine dyes, e.g., allophycocyanin, Cy3, Cy5, and
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); macrocyclic chelates of lanthanide ions,
e.g., quantum dye, etc.; and chemiluminescent molecules, e.g., luciferases.

[0461] Fluorescent subject proteins can also be generated by producing the
subject protein in an auxotrophic strain of bacteria which requires addition of one or
more amino acids in the medium for growth. A subject protein-encoding construct
that provides for expression in bacterial cells is introduced into the auxotrophic strain, 
and the bacteria are cultured in the presence of a fluorescent amino acid, which is 
incorporated into the subject protein produced by the bacterium. The subject protein 
is then purified from the bacterial culture using standard methods for protein 
purification. 

[0462] BRET is a protein-protein interaction assay based on energy transfer 
from a bioluminescent donor to a fluorescent acceptor protein. The BRET signal is 
measured by the ratio of the amount of light emitted by the acceptor to the amount of 
light emitted by the donor. The ratio of these two values increases as the two proteins 
are brought into proximity. The BRET assay has been described in the literature 
(U.S. Patent Nos. 6,020,192; 5,968,750; 5,874,304; Xu, et al. 1999). BRET assays 
can be performed by analyzing transfer between a bioluminescent donor protein and a 
fluorescent acceptor protein. Interaction between the donor and acceptor proteins can 
be monitored by a change in the ratio of light emitted by the bioluminescent and 
fluorescent proteins. In this application, the subject protein serves as donor and/or 
acceptor protein.
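
The ratio described above can be computed as in the following minimal Python sketch. The function names and the example readings are hypothetical. The subtraction of a donor-only control ratio is a commonly used background correction that is assumed here for illustration; it is not recited in this disclosure.

```python
def bret_ratio(acceptor_emission: float, donor_emission: float) -> float:
    """Light emitted by the acceptor divided by light emitted by the donor."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    if acceptor_emission < 0:
        raise ValueError("acceptor emission must be non-negative")
    return acceptor_emission / donor_emission


def net_bret(sample_acceptor: float, sample_donor: float,
             control_acceptor: float, control_donor: float) -> float:
    """Sample ratio minus the ratio measured with donor alone.

    Assumption: the donor-only control captures donor light that bleeds
    into the acceptor channel in the absence of any interaction.
    """
    return (bret_ratio(sample_acceptor, sample_donor)
            - bret_ratio(control_acceptor, control_donor))


if __name__ == "__main__":
    # Hypothetical luminometer readings (arbitrary units).
    print(bret_ratio(300.0, 1000.0))
    print(net_bret(300.0, 1000.0, 100.0, 1000.0))
```

An increase in the net ratio relative to the control is read, under these assumptions, as the donor-labeled and acceptor-labeled proteins being brought into proximity.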

[0463] Fluorescence anisotropy is a measurement of the rotational mobility 
of a multi-molecular complex. It can be used to generate information about the 
binding of one molecule to another, including the affinity and specificity of binding 
sites. It can be applied to polypeptides or nucleic acids of the present invention. 
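
As a worked illustration, steady-state anisotropy is conventionally calculated from the emission intensities measured parallel and perpendicular to the polarized excitation, r = (I_par - G * I_perp) / (I_par + 2 * G * I_perp), where G is an instrument correction factor. The Python sketch below assumes this standard formula; the function name, the default G of 1.0, and the example intensities are assumptions for illustration.

```python
def anisotropy(i_parallel: float, i_perpendicular: float,
               g_factor: float = 1.0) -> float:
    """Steady-state fluorescence anisotropy
    r = (I_par - G * I_perp) / (I_par + 2 * G * I_perp)."""
    denominator = i_parallel + 2.0 * g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - g_factor * i_perpendicular) / denominator


if __name__ == "__main__":
    # Hypothetical readings: a small labeled molecule free in solution
    # tumbles quickly (low r); bound to a larger partner it tumbles more
    # slowly and the anisotropy rises.
    free = anisotropy(110.0, 100.0)
    bound = anisotropy(300.0, 100.0)
    print(f"free r = {free:.3f}, bound r = {bound:.3f}")
```

Titrating the binding partner and plotting r against its concentration is one way such measurements can be related to binding affinity.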

[0464] Fluorescence quenching measurements are useful in detecting protein 
multimerization, such as where the subject protein interacts with at least a second 
protein and, for example, where multimerization interaction is affected by a test agent. 
As used herein, the term "multimerization" refers to formation of dimers, trimers, 
tetramers, and higher multimers of the subject protein. Whether a subject protein 
forms a complex with one or more additional protein molecules can be determined
using any known assay, including assays as described above for interacting proteins.
Formation of multimers can also be detected using non-denaturing gel 
electrophoresis, where multimerized subject protein migrates more slowly than 
monomeric subject protein. Formation of multimers can also be detected using
fluorescence quenching techniques. 

[0465] Formation of multimers can also be detected by analytical 
ultracentrifugation, for example through glycerol or sucrose gradients, and subsequent 
visualization of a subject protein in gradient fractions by Western blotting or staining 
of SDS-polyacrylamide gels. Multimers are expected to sediment at defined positions 
in such gradients. Formation of multimers can also be detected using analytical gel 
filtration, e.g., in HPLC or FPLC systems, e.g., on columns such as Superdex 200 
(Pharmacia Amersham Inc.). Multimers run at defined positions on these columns, 
and fractions can be analyzed as above. The columns are highly 
reproducible, allowing one to relate the number and position of peaks directly to the 
multimerization status of the protein. 
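
One way to relate peak position to multimerization status is to calibrate the column with proteins of known molecular mass and to fit log10(mass) against elution volume. The Python sketch below assumes such a log-linear calibration, which is a common approximation that ignores molecular shape; the standards, volumes, and monomer mass shown are hypothetical values chosen only for illustration.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(mass) = slope * volume + intercept.

    standards: iterable of (elution_volume_ml, molecular_mass_da) pairs.
    """
    points = [(v, math.log10(m)) for v, m in standards]
    n = len(points)
    if n < 2:
        raise ValueError("at least two standards are required")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("standards must have distinct elution volumes")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x


def apparent_mass(volume_ml, slope, intercept):
    """Apparent molecular mass (Da) of a peak eluting at volume_ml."""
    return 10 ** (slope * volume_ml + intercept)


def apparent_subunits(volume_ml, monomer_mass_da, slope, intercept):
    """Apparent mass divided by monomer mass (about 2 suggests a dimer)."""
    return apparent_mass(volume_ml, slope, intercept) / monomer_mass_da


if __name__ == "__main__":
    # Hypothetical calibration standards: (elution volume in ml, mass in Da).
    standards = [(10.0, 1_000_000), (12.0, 100_000), (14.0, 10_000)]
    slope, intercept = fit_calibration(standards)
    print(round(apparent_subunits(12.0, 50_000, slope, intercept), 2))
```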

2. Detecting mRNA Levels and Monitoring Gene Expression 
[0466] The present invention provides methods for detecting the presence of 
mRNA in a biological sample. The methods can be used, for example, to assess 
whether a test compound affects gene expression, either directly or indirectly. The 
present invention provides diagnostic methods to compare the abundance of a nucleic 
acid with that of a control value, either qualitatively or quantitatively, and to relate the 
value to a normal or abnormal expression pattern. 

[0467] Methods of measuring mRNA levels are known in the art (Pietu, 
1996; Zhao, 1995; Soares, 1997; Raval, 1994; Chalifour, 1994; Stolz, 1996; Hong, 
1982; McGraw, 1984; WO 97/27317). These methods generally comprise contacting 
a sample with a polynucleotide of the invention under conditions that allow 
hybridization and detecting hybridization, if any, as an indication of the presence of 
the polynucleotide of interest. Appropriate controls include the use of a sample 
lacking the polynucleotide mRNA of interest, or the use of a labeled polynucleotide of 
the same "sense" as a polynucleotide mRNA of interest. Detection can be 
accomplished by any known method, including, but not limited to, in situ 
hybridization, PCR, RT-PCR, and "Northern" or RNA blotting, or combinations of 
such techniques, using a suitably labeled subject polynucleotide. A variety of labels 
and labeling methods for polynucleotides are known in the art and can be used in the
assay methods of the invention. A common method employed is use of microarrays
which can be purchased or customized, for example, through conventional vendors
such as Affymetrix.

[0468] In some embodiments, the methods involve generating a cDNA copy 
of an mRNA molecule in a biological sample, and amplifying the cDNA using an 
isolated primer pair as described above, i.e., a set of two nucleic acid molecules that
serve as forward and reverse primers in an amplification reaction (e.g., a polymerase 
chain reaction). The primer pairs are chosen to specifically amplify a cDNA copy of 
an mRNA encoding a polypeptide. A detectable label can be included in the 
amplification reaction, as provided above. Methods using PCR amplification can be 
performed on the DNA from a single cell, although it is convenient to use at least 
about 10^5 cells.
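
The way a forward and reverse primer pair defines the amplified region can be sketched as an in silico amplification. The Python example below is a simplified illustration: it requires exact primer matches, considers only the first forward-primer site, and uses a hypothetical cDNA and unrealistically short hypothetical primers. The function names are assumptions for this sketch.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the predicted amplicon on the sense strand, or None.

    The forward primer matches the sense strand directly; the reverse
    primer anneals to the sense strand, so its reverse complement is
    searched for downstream of the forward primer.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    cdna = "GGATCCATGAAACCCGGGTTTTAGCTCGAG"   # hypothetical cDNA
    product = predict_amplicon(cdna, forward="ATGAAA", reverse="CTAAAA")
    print(product, len(product))
```

A pair that returns None for off-target templates and a single product for the intended cDNA corresponds, in this simplified model, to a primer pair chosen to specifically amplify that cDNA.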

[0469] The present invention provides methods for monitoring gene 
expression. Changes in a promoter or enhancer sequence that can affect gene 
expression can be examined in light of expression levels of the normal allele by 
various methods known in the art. Methods for determining promoter or enhancer 
strength include quantifying the expressed natural protein, and inserting the variant 
control element into a vector with a quantitative reporter gene such as β-galactosidase,
luciferase, or chloramphenicol acetyltransferase (CAT).

3. Detecting Polymorphisms and Mutations
[0470] Biochemical studies can determine whether a sequence 
polymorphism in a coding region or control region is associated with disease. 
Disease-associated polymorphisms can include deletion or truncation of the gene, 
mutations that alter expression level, or mutations that affect protein function, etc. A 
number of methods are available to analyze nucleic acids for the presence of a 
specific sequence, e.g., a disease associated polymorphism. Genomic DNA can be 
used when large amounts of DNA are available. Alternatively, the region of interest 
is cloned into a suitable vector and grown in sufficient quantity for analysis. Cells 
that express the gene provide a source of mRNA, which can be assayed directly or 
reverse transcribed into cDNA for analysis. The nucleic acid can be amplified by 
conventional techniques, e.g., PCR, to provide sufficient amounts for analysis (Saiki
et al., 1988; Sambrook et al., 1989, pp. 14.2-14.33). Alternatively, various methods
are known in the art that utilize oligonucleotide ligation as a means of detecting 
polymorphisms (Riley et al., 1990; Delahunty et al., 1996).

[0471] The sample nucleic acid, e.g., an amplified or cloned fragment, is
analyzed by one of a number of methods known in the art. The nucleic acid can be 
sequenced by dideoxy nucleotide sequencing, or other methods, and the sequence of 
bases compared to a wild-type sequence. Hybridization with the variant sequence can 
also be used to determine its presence, e.g., by Southern blots, dot blots, etc. The 
hybridization pattern of a control and variant sequence to an array of oligonucleotide 
probes immobilized on a solid support, as described in US Pat. No. 5,445,934, or 
WO 95/35505, can also be used as a means of detecting the presence of variant 
sequences. Single strand conformational polymorphism (SSCP) analysis, denaturing 
gradient gel electrophoresis (DGGE), and heteroduplex analysis in gel matrices can 
detect variation as alterations in electrophoretic mobility resulting from 
conformational changes created by DNA sequence alterations. Alternatively, where a 
polymorphism creates or destroys a recognition site for a restriction endonuclease, the 
sample can be digested with that endonuclease, and the products fractionated 
according to their size to determine whether the fragment was digested. Fractionation 
can be performed by gel or capillary electrophoresis, for example with acrylamide or 
agarose gels. 
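
The restriction-site approach described above can be illustrated with a short Python sketch that predicts fragment sizes for a linear DNA. EcoRI (recognition sequence GAATTC, cutting after the first G) is used as the example enzyme; the amplicon sequences and the function name are hypothetical. Because the site is palindromic, searching one strand is sufficient in this sketch.

```python
def digest_fragment_sizes(sequence: str, site: str, cut_offset: int):
    """Sizes of the fragments produced by cutting a linear DNA at every
    occurrence of `site`; `cut_offset` is the number of bases of the
    site that lie to the left of the cut."""
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    position = sequence.find(site)
    while position != -1:
        cuts.append(position + cut_offset)
        position = sequence.find(site, position + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])
            if right > left]


if __name__ == "__main__":
    # Hypothetical 26-bp amplicons differing by one base inside the site.
    wild_type = "ACGTACGTACGT" + "GAATTC" + "ACGTACGT"
    variant = "ACGTACGTACGT" + "GAGTTC" + "ACGTACGT"
    print(digest_fragment_sizes(wild_type, "GAATTC", 1))
    print(digest_fragment_sizes(variant, "GAATTC", 1))
```

Comparing the predicted fragment sizes with the bands observed after electrophoresis indicates whether the sample carries the allele that retains the site or the allele that has lost it.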

[0472] Screening for mutations in a gene can be based on the functional or 
antigenic characteristics of the protein. Protein truncation assays are useful in 
detecting deletions that might affect the biological activity of the protein. Various 
immunoassays designed to detect polymorphisms in proteins can be used in screening. 
Where many diverse genetic mutations lead to a particular disease phenotype, 
functional protein assays have proven to be effective screening tools. The activity of 
the encoded protein can be determined by comparison with the wild-type protein.

4. Detecting and Monitoring Polypeptide Presence and Biological Activity
[0473] The present invention provides methods for detecting the presence 
and/or biological activity of a subject polypeptide in a biological sample. The assay 
used will be appropriate to the biological activity of the particular polypeptide. Thus, 
e.g., where the biological activity is an enzymatic activity, the method will involve 
contacting the sample with an appropriate substrate, and detecting the product of the 
enzymatic reaction on the substrate. Where the biological activity is binding to a
second macromolecule, the assay detects protein-protein binding, protein-DNA 
binding, protein-carbohydrate binding, or protein-lipid binding, as appropriate, using 
well known assays. Where the biological activity is signal transduction (e.g.,
transmission of a signal from outside the cell to inside the cell) or transport, an
appropriate assay is used, such as measurement of intracellular calcium ion 
concentration, measurement of membrane conductance changes, or measurement of 
intracellular potassium ion concentration. 

[0474] The present invention also provides methods for detecting the 
presence or measuring the level of a normal or abnormal polypeptide in a biological 
sample using a specific antibody. The methods generally comprise contacting the 
sample with a specific antibody and detecting binding between the antibody and 
molecules of the sample. Specific antibody binding, when compared to a suitable 
control, is an indication that a polypeptide of interest is present in the sample. 
Suitable controls include a sample known not to contain the polypeptide, and a sample 
contacted with a non-specific antibody, e.g., an anti-idiotype antibody. 

[0475] A variety of methods to detect specific antibody-antigen interactions 
are known in the art, e.g., standard immunohistological methods, 
immunoprecipitation, enzyme immunoassay, and radioimmunoassay. The specific 
antibody can be detectably labeled, either directly or indirectly, as described at length 
herein, and cells are permeabilized to stain cytoplasmic molecules. Briefly, 
antibodies are added to a cell sample, and incubated for a period of time sufficient to 
allow binding to the epitope, usually at least about 10 minutes. The antibody may be 
labeled with radioisotopes, enzymes, fluorescers, chemiluminescers, or other labels 
for direct detection. Alternatively, specific-binding pairs may be used, involving, 
e.g., a second stage antibody or reagent that is detectably-labeled, as described above. 
Such reagents and their methods of use are well known in the art. 

[0476] Alternatively, a biological sample can be brought into contact with an 
immobilized antibody on a solid support or carrier, such as nitrocellulose, that is 
capable of immobilizing cells, cell particles, or soluble proteins. The antibody can be 
attached (coupled) to an insoluble support, such as a polystyrene plate or a bead. 
After contacting the sample, the support can then be washed with suitable buffers, 
followed by contacting with a detectably-labeled specific antibody. Detection 
methods are known in the art and will be chosen as appropriate to the signal emitted 
by the detectable label. Detection is generally accomplished in comparison to suitable 
controls, and to appropriate standards. 

[0477] The present invention further provides methods for detecting the 
presence and/or levels of enzymatic activity of a subject polypeptide in a biological 
sample. The methods generally involve contacting the sample with a substrate that 
yields a detectable product upon being acted upon by a subject polypeptide, and 
detecting a product of the enzymatic reaction. Further, polypeptides that are subsets 
of the complete sequences of the subject proteins may be used to identify and 
investigate parts of the protein important for function. 

[0478] The present invention further includes methods for monitoring 
activity of a polypeptide through observation of phenotypic changes in a cell 
containing such polypeptide, such as growth or differentiation, or the ability of such a 
cell to secrete a molecule that can be detected, such as through chemical methods or 
through its effect on another cell, such as cell activation. 

5. Modulating mRNA and Peptides in Biological Samples 
[0479] The present invention provides screening methods for identifying 
agents that modulate the level of a mRNA molecule of the invention, agents that 
modulate the level of a polypeptide of the invention, and agents that modulate the 
biological activity of a polypeptide of the invention. In some embodiments, the assay 
is cell-free; in others, it is cell-based. Where the screening assay is a binding assay, 
one or more of the molecules can be joined to a label, where the label can directly or 
indirectly provide a detectable signal. 

[0480] As discussed above, the invention encompasses endogenous 
polynucleotides of the invention that encode mRNA and/or polypeptides of interest. 
Again as discussed previously, the invention also encompasses exogenous 
polynucleotides that encode mRNA or polypeptides of the invention. For example, 
the polynucleotide can reside within a recombinant vector which is introduced into the 
cell. For example, a recombinant vector can comprise an isolated transcriptional 
regulatory sequence which is associated in nature with a nucleic acid, such as a 
promoter sequence operably linked to sequences coding for a polypeptide of the 
invention; or the transcriptional control sequences can be operably linked to coding 
sequences for a polypeptide fusion protein comprising a polypeptide of the invention 
fused to a polypeptide that facilitates detection. 

[0481] In these embodiments, the candidate agent is combined with a cell 
possessing a polynucleotide transcriptional regulatory element operably linked to a 
polypeptide-coding sequence of interest, e.g., a subject cDNA or its genomic 
component, and the agent's effect on polynucleotide expression is determined, as 
measured, for example, by the level of mRNA, polypeptide, or fusion polypeptide. 

[0482] In other embodiments, for example, a recombinant vector can 
comprise an isolated polynucleotide transcriptional regulatory sequence, such as a 
promoter sequence, operably linked to a reporter gene (e.g., β-galactosidase, CAT, 
luciferase, or other gene that can be easily assayed for expression). In these 
embodiments, the method for identifying an agent that modulates a level of 
expression of a polynucleotide in a cell comprises combining a candidate agent with a 
cell comprising a transcriptional regulatory element operably linked to a reporter 
gene; and determining the effect of said agent on reporter gene expression. 

[0483] Known methods of measuring mRNA levels can be used to identify 
agents that modulate mRNA levels, including, but not limited to, PCR with 
detectably-labeled primers. Similarly, agents that modulate polypeptide levels can be 
identified using standard methods for determining polypeptide levels, including, but 
not limited to an immunoassay such as ELISA with detectably-labeled antibodies. 

[0484] A wide variety of cell-based assays can also be used to identify 
agents that modulate eukaryotic or prokaryotic mRNA and/or polypeptide levels. 
Examples include transformed cells that over-express a cDNA construct and cells 
transformed with a polynucleotide of interest associated with an endogenously- 
associated promoter operably linked to a reporter gene. A control sample would 
comprise, for example, the same cell lacking the candidate agent. Expression levels 
are measured and compared in the test and control samples. 

[0485] The cells used in the assay are usually mammalian cells, including, 
but not limited to, rodent cells and human cells. The cells can be primary cell cultures 
or can be immortalized cell lines. Cell-based assays generally comprise the steps of 
contacting the cell with a test agent, forming a test sample, and, after a suitable time, 
assessing the agent's effect on macromolecule expression. That is, the mammalian 
cell line is transformed or transfected with a construct that results in expression of the 
polynucleotide, the cell is contacted with a test agent, and then mRNA or polypeptide 
levels are detected and measured using conventional assays. 

[0486] A suitable period of time for contacting the agent with the cell can be 
determined empirically, and is generally a time sufficient to allow entry of the agent 
into the cell and to allow the agent to have a measurable effect on subject mRNA 
and/or polypeptide levels. Generally, a suitable time is between about 10 minutes and 
about 24 hours, including about 1 to about 8 hours. Alternatively, incubation periods 
may be between about 0.1 and about 1 hour, selected for example for optimum 
activity or to facilitate rapid high-throughput screening. Where the polypeptide is 
expressed on the cell surface, however, a shorter length of time may be sufficient. 
Incubations are performed at any suitable temperature, i.e., between about 4°C and 
about 40°C. The contact and incubation steps can be followed by a washing step to 
remove unbound components, i.e., a label that would give rise to a background signal 
during subsequent detection of specifically-bound complexes. 

[0487] A variety of assay configurations and protocols are known in the art. 
For example, one of the components can be bound to a solid support, and the 
remaining components contacted with the support bound component. Remaining 
components may be added at different times or at substantially the same time. 
Further, where the interacting protein is a second subject protein, the effect of the test 
agent on binding can be determined by determining the effect on multimerization of the 
subject protein. 

[0488] The present invention further provides methods of identifying agents 
that modulate a biological activity of a polypeptide of the invention. The method 
generally comprises contacting a test agent with a sample containing a subject 
polypeptide and assaying a biological activity of the subject polypeptide in the 
presence of the test agent. An increase or a decrease in the assayed biological activity 
in comparison to the activity in a suitable control (e.g., a sample comprising a subject 
polypeptide in the absence of the test agent) is an indication that the substance 
modulates a biological activity of the subject polypeptide. The mixture of 
components is added in any order that provides for the requisite interaction. 
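
The comparison described in this paragraph can be illustrated with a minimal sketch. It assumes replicate activity readings in arbitrary units for a sample contacted with the test agent and for a control lacking the agent. The function name `classify_modulation` and the 20% cut-off are hypothetical choices made for illustration only; the specification does not prescribe a threshold.

```python
from statistics import mean


def classify_modulation(test_readings, control_readings, threshold=0.2):
    """Compare mean activity with the test agent to mean activity without it.

    `threshold` is an illustrative cut-off (an assumption, not taken from
    the specification): a ratio within 1 +/- threshold counts as no change.
    """
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control activity must be positive")
    ratio = mean(test_readings) / control
    if ratio >= 1 + threshold:
        call = "increase"
    elif ratio <= 1 - threshold:
        call = "decrease"
    else:
        call = "no change"
    return ratio, call


if __name__ == "__main__":
    # Toy readings, not experimental data.
    print(classify_modulation([48, 52], [100, 100]))
```

In practice the cut-off would be set from the variability of the particular assay and its controls rather than fixed in advance.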

[0489] External and internal processes that can affect modulation of a 
macromolecule of the invention include, but are not limited to, infection of a cell by a 
microorganism, including, but not limited to, a bacterium (e.g., Mycobacterium spp., 
Shigella, or Chlamydia), a protozoan (e.g., Trypanosoma spp., Plasmodium spp., or 
Toxoplasma spp.), a fungus, a yeast (e.g., Candida spp.), or a virus (including viruses 
that infect mammalian cells, such as human immunodeficiency virus, foot and mouth 
disease virus, Epstein-Barr virus, and viruses that infect plant cells); change in pH of 
the medium in which a cell is maintained or a change in internal pH; excessive heat 
relative to the normal range for the cell or the multicellular organism; excessive cold 
relative to the normal range for the cell or the multicellular organism; an effector 
molecule such as a hormone, a cytokine, a chemokine, a neurotransmitter; an ingested 
or applied drug; a ligand for a cell-surface receptor; a ligand for a receptor that exists 
internally in a cell, e.g., a nuclear receptor; hypoxia; light; dark; sleep patterns; 
electrical charge; ion concentration of the medium in which a cell is maintained or an 
internal ion concentration, exemplary ions including sodium ions, potassium ions, 
chloride ions, calcium ions, and the like; presence or absence of a nutrient; metal ions; 
a transcription factor; mitogens, including, but not limited to, lipopolysaccharide 
(LPS) and pokeweed mitogen; antigens; a tumor suppressor; and cell-cell contact. 
Such processes must be taken into consideration in the screening assay. 

[0490] A variety of other reagents can be included in the screening assay. 
These include salts, neutral proteins, e.g., albumin, detergents, and other compounds 
that facilitate optimal binding and/or reduce non-specific or background interactions. 
Reagents that improve the efficiency of the assay, such as protease inhibitors, 
nuclease inhibitors, or anti-microbial agents, etc., can be used. 

[0491] Accordingly, the present invention provides a method for identifying 
an agent, particularly a biologically active agent that modulates the level of 
expression of a nucleic acid in a cell, the method comprising: combining a candidate 
agent to be tested with a cell comprising a nucleic acid that encodes a polypeptide, 
and determining the agent's effect on polypeptide expression. 

[0492] Some embodiments will detect agents that decrease the biological 
activity of a molecule of the invention. Maximal inhibition of the activity is not 
always necessary, or even desired, in every instance to achieve a therapeutic effect. 
Agents that decrease a biological activity can find use in treating disorders associated 
with the biological activity of the molecule. Alternatively, some embodiments will 
detect agents that increase a biological activity. Agents that increase a biological 
activity of a molecule of the invention can find use in treating disorders associated 
with a deficiency in the biological activity. Agents that increase or decrease a 
biological activity of a molecule of the invention can be selected for further study, and 
assessed for physiological attributes, i.e., cellular availability, cytotoxicity, or 
biocompatibility, and optimized as required. For example, a candidate agent is 
assessed for any cytotoxic activity it may exhibit toward the cell used in the assay 
using well-known assays, such as trypan blue dye exclusion, an MTT ([3-(4,5- 
dimethylthiazol-2-yl)-2,5-diphenyl-2H-tetrazolium bromide]) assay, and the like. 

[0493] A variety of different candidate agents can be screened by the above 
methods. Candidate agents encompass numerous chemical classes, as described 
above. 

[0494] Candidate agents are obtained from a wide variety of sources 
including libraries of synthetic or natural compounds. Numerous means are available 
for random and directed synthesis of a wide variety of organic compounds and 
biomolecules, including expression of randomized oligonucleotides and 
oligopeptides. Examples include random peptide libraries obtained by yeast two-hybrid 
screens (Xu et al., 1997), phage libraries (Hoogenboom et al., 1998), and chemically 
generated libraries. Alternatively, libraries of natural compounds in the form of 
bacterial, fungal, plant and animal extracts are available or readily produced, 
including antibodies produced upon immunization of an animal with subject 
polypeptides, or fragments thereof, or with the encoding polynucleotides. 
Additionally, natural or synthetically produced libraries and compounds are readily 
modified through conventional chemical, physical and biochemical means, and can be 
used to produce combinatorial libraries. Further, known pharmacological agents can 
be subjected to directed or random chemical modifications, such as acylation, 
alkylation, esterification, and amidification, etc., to produce structural analogs. 

6. Kits 

[0495] The present invention provides methods for diagnosing disease-states 
based on the detected presence and/or level of polynucleotide or polypeptide in a 
biological sample, and/or the detected presence and/or level of biological activity of 
the polynucleotide or polypeptide. These detection methods can be provided as part 
of a kit. Thus, the invention further provides kits for detecting the presence and/or a 
level of a polynucleotide or polypeptide in a biological sample and/or the detected 
presence and/or level of biological activity of the polynucleotide or polypeptide. 
Procedures using these kits can be performed by clinical laboratories, experimental 
laboratories, medical practitioners, or private individuals. 

[0496] The kits of the invention will comprise a molecule of the invention. 
The kits for detecting a polynucleotide will also comprise a moiety that specifically 
hybridizes to a polynucleotide of the invention. The polynucleotide molecule can be 
of any length. For example, it can comprise a polynucleotide of at least 6, at least 7, 
at least 8, or at least 9 contiguous nucleotides of a molecule of the invention. Kits of 
the invention for detecting a subject polypeptide will comprise a moiety that 
specifically binds to a polypeptide of the invention; the moiety includes, but is not 
limited to, a polypeptide-specific antibody. 

[0497] The kits are useful in diagnostic applications. For example, the kit is 
useful to determine whether a given DNA sample isolated from an individual 
comprises an expressed nucleic acid, a polymorphism, or other variant. 

[0498] Kits for detecting polynucleotides comprise a pair of nucleic acids in 
a suitable storage medium, e.g., a buffered solution, in a suitable container. The pair 
of isolated nucleic acid molecules serve as primers in an amplification reaction (e.g., a 
polymerase chain reaction). The kit can further include additional buffers, reagents 
for polymerase chain reaction (e.g., deoxynucleotide triphosphates (dNTP), a 
thermostable DNA polymerase, a solution containing Mg²⁺ ions (e.g., MgCl₂), and 
other components well known to those skilled in the art for carrying out a polymerase 
chain reaction). The kit can further include instructions for use, which may be 
provided in a variety of forms, e.g., printed information, or compact disc, and the like. 
The kit may further include reagents necessary to extract DNA from a biological 
sample and reagents for generating a cDNA copy of an mRNA. The kit may 
optionally provide additional useful components, including, but not limited to, 
buffers, developing reagents, labels, reacting surfaces, means for detections, control 
samples, standards, and interpretive information. 

[0499] In some embodiments, a kit of the invention for detecting a 
polynucleotide, such as an mRNA encoding a polypeptide, comprises a pair of nucleic 
acids that function as "forward" and "reverse" primers that specifically amplify a 
cDNA copy of the mRNA. The "forward" and "reverse" primers are provided as a 
pair of isolated nucleic acid molecules, each from about 10 to about 200 nucleotides 
in length, the first nucleic acid molecule of the pair comprising a sequence of at least 
about 10 contiguous nucleotides having 100% sequence identity to a nucleic acid 
sequence shown in SEQ ID NOS.: 1-104, and the second nucleic acid molecule 
of the pair comprising a sequence of at least about 10 contiguous nucleotides having 
100% sequence identity to the reverse complement of a nucleic acid sequence shown 
in SEQ ID NOS.: 1-104, wherein the sequence of the second nucleic acid molecule 
is located 3' of the nucleic acid sequence of the first nucleic acid molecule. The 
primer nucleic acids are prepared using any known method, e.g., automated synthesis. 
In some embodiments, one or both members of the pair of nucleic acid molecules 
comprise a detectable label. 
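
The primer-pair criteria of this paragraph can be checked programmatically. The following is a minimal sketch under stated assumptions: the template is a plain DNA string standing in for a target sequence, the helper names (`reverse_complement`, `first_match`, `is_valid_pair`) are hypothetical, and "located 3' of" is interpreted here as the reverse primer's site beginning at or after the end of the forward primer's matched 10-nucleotide window.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def first_match(primer, target, min_len=10):
    """Return the position in `target` of the first window of `min_len`
    contiguous primer nucleotides found with 100% identity, else None."""
    primer, target = primer.upper(), target.upper()
    for i in range(len(primer) - min_len + 1):
        pos = target.find(primer[i:i + min_len])
        if pos != -1:
            return pos
    return None


def is_valid_pair(forward, reverse, template, min_len=10, max_len=200):
    """Check primer length, identity, and relative position on the template."""
    if not all(min_len <= len(p) <= max_len for p in (forward, reverse)):
        return False
    fwd_pos = first_match(forward, template, min_len)
    # The reverse primer matches the reverse complement of the template,
    # so its own reverse complement must occur on the template strand.
    rev_pos = first_match(reverse_complement(reverse), template, min_len)
    if fwd_pos is None or rev_pos is None:
        return False
    # Assumed reading of "3' of": the sites do not overlap.
    return rev_pos >= fwd_pos + min_len


if __name__ == "__main__":
    # Toy template for illustration only; not one of SEQ ID NOS.: 1-104.
    template = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    fwd = template[:10]
    rev = reverse_complement(template[-10:])
    print(is_valid_pair(fwd, rev, template))
```

The sketch tests sequence identity and orientation only; melting temperature, secondary structure, and specificity against other transcripts would be evaluated separately.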

[0500] Where the kit provides for polypeptide detection, it can include one 
or more specific antibodies. In some embodiments, the antibody specific to the 
polypeptide is detectably labeled. In other embodiments, the antibody specific to the 
polypeptide is not labeled; instead, a second, detectably-labeled antibody is provided 
that binds to the specific antibody. The kit may further include blocking reagents, 
buffers, and reagents for developing and/or detecting the detectable marker. The kit 
may further include instructions for use, controls, and interpretive information. 

[0501] Where the kit provides for detecting enzymatic activity, it includes a 
substrate that provides for a detectable product when acted upon by a polypeptide of 
interest. The kit may further include reagents necessary to detect and develop the 
detectable marker. 

[0502] The present invention provides for kits with unit doses of an active 
agent. These agents are described in more detail below. In some embodiments, the 
agent is provided in oral or injectable doses. Such kits will comprise containers 
containing the unit doses and an informational package insert describing the use and 
attendant benefits of the drugs in treating a condition of interest. 

Therapeutic Compositions 

[0503] The invention further provides agents identified using a screening 
assay of the invention, and compositions comprising the agents, subject polypeptides, 
subject polynucleotides, recombinant vectors, and/or host cells, including 
pharmaceutical compositions for therapeutic administration. The subject 
compositions can be formulated using well-known reagents and methods. These 
compositions can include a buffer, which is selected according to the desired use of 
the agent, polypeptide, polynucleotide, recombinant vector, or host cell, and can also 
include other substances appropriate to the intended use. Those skilled in the art can 
readily select an appropriate buffer, a wide variety of which are known in the art, 
suitable for an intended use. 

1. Excipients and Formulations 

[0504] In some embodiments, compositions are provided in formulation with 
pharmaceutically acceptable excipients, a wide variety of which are known in the art 
(Gennaro, 2000; Ansel et al., 1999; Kibbe et al., 2000). Pharmaceutically acceptable 
excipients, such as vehicles, adjuvants, carriers or diluents, are readily available to the 
public. Moreover, pharmaceutically acceptable auxiliary substances, such as pH 
adjusting and buffering agents, tonicity adjusting agents, stabilizers, wetting agents 
and the like, are readily available to the public. 

[0505] In pharmaceutical dosage forms, the compositions of the invention 
can be administered in the form of their pharmaceutically acceptable salts, or they can 
also be used alone or in appropriate association, as well as in combination, with other 
pharmaceutically active compounds. The subject compositions are formulated in 
accordance with the mode of potential administration. Administration of the agents can 
be achieved in various ways, including oral, buccal, nasal, rectal, parenteral, 
intraperitoneal, intradermal, transdermal, subcutaneous, intravenous, intra-arterial, 
intracardiac, intraventricular, intracranial, intratracheal, and intrathecal 
administration, etc., or otherwise by implantation or inhalation. Thus, the subject 
compositions can be formulated into preparations in solid, semi-solid, liquid or 
gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, 
suppositories, injections, inhalants and aerosols. The following methods and 
excipients are merely exemplary and are in no way limiting. 

[0506] For oral preparations, the agents, polynucleotides, and polypeptides 
can be used alone or in combination with appropriate additives to make tablets, 
powders, granules or capsules, for example, with conventional additives, such as 
lactose, mannitol, corn starch, or potato starch; with binders, such as crystalline 
cellulose, cellulose derivatives, acacia, corn starch, or gelatins; with disintegrators, 
such as corn starch, potato starch, or sodium carboxymethylcellulose; with lubricants, 
such as talc or magnesium stearate; and if desired, with diluents, buffering agents, 
moistening agents, preservatives, and flavoring agents. 

[0507] Suitable excipient vehicles are, for example, water, saline, dextrose, 
glycerol, ethanol, or the like, and combinations thereof. In addition, if desired, the 
vehicle can contain minor amounts of auxiliary substances such as wetting or 
emulsifying agents or pH buffering agents. Actual methods of preparing such dosage 
forms are known, or will be apparent, to those skilled in the art (Remington, 1985). 
The composition or formulation to be administered will, in any event, contain a 
quantity of the agent adequate to achieve the desired state in the subject being treated. 

[0508] The agents, polynucleotides, and polypeptides can be formulated into 
preparations for injection by dissolving, suspending or emulsifying them in an 
aqueous or nonaqueous solvent, such as vegetable or other similar oils, synthetic 
aliphatic acid glycerides, esters of higher aliphatic acids or propylene glycol; and if 
desired, with conventional additives such as solubilizers, isotonic agents, suspending 
agents, emulsifying agents, stabilizers and preservatives. Other formulations for oral 
or parenteral delivery can also be used, as conventional in the art. 

[0509] The agents, polynucleotides, and polypeptides can be utilized in 
aerosol formulation to be administered via inhalation. The compounds of the present 
invention can be formulated into pressurized acceptable propellants such as 
dichlorodifluoromethane, propane, nitrogen, and the like. Further, the agent, 
polynucleotides, or polypeptide composition may be converted to powder form for 
administration intranasally or by inhalation, as conventional in the art. 

[0510] Furthermore, the agents can be made into suppositories by mixing 
with a variety of bases such as emulsifying bases or water-soluble bases. The 
compounds of the present invention can be administered rectally via a suppository. 
The suppository can include vehicles such as cocoa butter, carbowaxes and 
polyethylene glycols, which melt at body temperature, yet are solidified at room 
temperature. 

[0511] A polynucleotide, polypeptide, or other modulator can also be 
introduced into tissues or host cells by other routes, such as viral infection, 
microinjection, or vesicle fusion. For example, expression vectors can be used to 
introduce nucleic acid compositions into a cell as described above. Further, jet 
injection can be used for intramuscular administration (Furth et al., 1992). The DNA 
can be coated onto gold microparticles, and delivered intradermally by a particle 
bombardment device, or "gene gun" as described in the literature (Tang et al., 1992), 
where gold microprojectiles are coated with the DNA, then bombarded into skin cells. 

[0512] Unit dosage forms for oral or rectal administration such as syrups, 
elixirs, and suspensions can be provided wherein each dosage unit, for example, 
teaspoonful, tablespoonful, tablet, or suppository, contains a predetermined amount of 
the composition containing one or more agents. Similarly, unit dosage forms for 
injection or intravenous administration can comprise the agent(s) in a composition as 
a solution in sterile water, normal saline or another pharmaceutically acceptable 
carrier. 

2. Active Agents (or Modulators) 

[0513] The nucleic acid, polypeptide, and modulator compositions of the 
subject invention find use as therapeutic agents in situations where one wishes to 
modulate an activity of a subject polypeptide in a host, particularly the activity of the 
subject polypeptides, or to provide or inhibit the activity at a particular anatomical 
site. Thus, the compositions are useful in treating disorders associated with an 
activity of a subject polypeptide. The following provides further details of active 
agents of the present invention. 

a) Antisense Oligonucleotides 

[0514] In certain embodiments of the invention, the active agent is an agent 
that modulates, and generally decreases or down regulates, the expression of a gene 
encoding a target protein in a host, i.e., antisense molecules. Anti-sense reagents 
include antisense oligonucleotides (ODN), i.e., synthetic ODN having chemical 
modifications from native nucleic acids, or nucleic acid constructs that express such 
anti-sense molecules as RNA. The antisense sequence is complementary to the 
mRNA of the targeted gene, and inhibits expression of the targeted gene products. 
Antisense molecules inhibit gene expression through various mechanisms, e.g., by 
reducing the amount of mRNA available for translation, through activation of RNase 
H, or steric hindrance. One or a combination of antisense molecules can be 
administered, where a combination can comprise multiple different sequences. 

[0515] Antisense molecules can be produced by expression of all or a part of 
the target gene sequence in an appropriate vector, where the transcriptional initiation 
is oriented such that an antisense strand is produced as an RNA molecule. 
Alternatively, the antisense molecule is a synthetic oligonucleotide. Antisense 
oligonucleotides can be chemically synthesized by methods known in the art (Wagner 
et al., 1993; Milligan et al., 1993). Oligonucleotides can be chemically modified from 
the native phosphodiester structure to increase their intracellular stability and binding 
affinity, for example, as described in detail above. Antisense oligonucleotides will 
generally be at least about 7, at least about 12, or at least about 20 nucleotides in 
length, and not more than about 500, not more than about 50, or not more than about 
35 nucleotides in length, where the length is governed by efficiency of inhibition, and 
specificity, including absence of cross-reactivity, and the like. Short oligonucleotides, 
of from about 7 to about 8 bases in length, can be strong and selective inhibitors of 
gene expression (Wagner et al., 1996). 

[0516] A specific region or regions of the endogenous sense strand mRNA 
sequence is chosen to be complemented by the antisense sequence. Selection of a 
specific sequence for the oligonucleotide can use an empirical method, where several 
candidate sequences are assayed for inhibition of expression of the target gene in an in 
vitro or animal model. A combination of sequences can also be used, where several 
regions of the mRNA sequence are selected for antisense complementation. 
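
As a hedged illustration of the empirical selection described above, the sketch below enumerates candidate antisense oligodeoxynucleotides by sliding a window along a target mRNA and taking the reverse complement of each window. The function name `antisense_candidates`, the default window length of 20, and the step of 5 are assumptions chosen for illustration; the length bounds of 7 and 500 nucleotides follow the ranges stated for antisense oligonucleotides in this section. The code only lists candidates; which of them inhibit expression would still be determined by assay.

```python
# Maps each mRNA base to its complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_candidates(mrna, length=20, step=5):
    """Yield (start, oligo) pairs, where each oligo is the DNA reverse
    complement of the mRNA window beginning at `start`."""
    if not 7 <= length <= 500:
        raise ValueError("length outside the range given in the text")
    # Accept cDNA-style input by converting T to U first.
    mrna = mrna.upper().replace("T", "U")
    for start in range(0, len(mrna) - length + 1, step):
        window = mrna[start:start + length]
        yield start, window.translate(RNA_TO_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Toy sequence for illustration only.
    for start, oligo in antisense_candidates("AUGGCCAUUGUAAUGGGCCGCUGA",
                                             length=20, step=2):
        print(start, oligo)
```

Several windows can be combined to target more than one region of the same mRNA, in line with the use of a combination of sequences.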

[0517] As an alternative to anti-sense inhibitors, catalytic nucleic acid 
compounds, e.g., ribozymes, or anti-sense conjugates can be used to inhibit gene 
expression. Ribozymes can be synthesized in vitro and administered to the patient, or 
can be encoded in an expression vector, from which the ribozyme is synthesized in 
the targeted cell (WO 9523225; Beigelman et al., 1995). Examples of 
oligonucleotides with catalytic activity are described in WO 9506764. Conjugates of 
anti-sense ODN with a metal complex, e.g., terpyridyl Cu(II), capable of mediating 
mRNA hydrolysis are described in Bashkin et al., 1995. 

b) Interfering RNA 

[0518] In some embodiments, the active agent is an interfering RNA 
(RNAi), including dsRNAi. RNA interference provides a method of silencing 
eukaryotic genes. Double stranded RNA can induce the homology-dependent 
degradation of its cognate mRNA in C. elegans, fungi, plants, Drosophila, and 
mammals (Gaudilliere et al., 2002). Use of RNAi to reduce a level of a particular 
mRNA and/or protein is based on the interfering properties of double-stranded RNA 
derived from the coding regions of a gene. The technique reduces the time between 
identifying an interesting gene sequence and understanding its function, and thus is an 
efficient high-throughput method for disrupting gene function (ONeil, 2001). RNAi 
can also help identify the biochemical mode of action of a drug and to identify other 
genes encoding products that can respond or interact with specific compounds. 

[0519] In one embodiment of the invention, complementary sense and 
antisense RNAs derived from a substantial portion of the subject polynucleotide are 
synthesized in vitro. The resulting sense and antisense RNAs are annealed in an 
injection buffer, and the double-stranded RNA injected or otherwise introduced into 
the subject, i.e., in food or by immersion in buffer containing the RNA (Gaudilliere et 
al., 2002; ONeil et al., 2001; WO 99/32619). In another embodiment, dsRNA derived 
from a gene of the present invention is generated in vivo by simultaneously expressing 
both sense and antisense RNA from appropriately positioned promoters operably 
linked to coding sequences in both sense and antisense orientations. 

c) Peptides and Modified Peptides 

[0520] In some embodiments of the present invention, the active agent is a 
peptide. Suitable peptides include peptides of from about 3 amino acids to about 50, 
from about 5 to about 30, or from about 10 to about 25 amino acids in length. In 
some embodiments, a peptide has a sequence of from about 3 amino acids to about 
50, from about 5 to about 30, or from about 10 to about 25 amino acids of a 
corresponding naturally-occurring protein. In some embodiments, a peptide exhibits 
one or more of the following activities: inhibits binding of a subject polypeptide to an 
interacting protein or other molecule; inhibits subject polypeptide binding to a second 
polypeptide molecule; inhibits a signal transduction activity of a subject polypeptide; 
inhibits an enzymatic activity of a subject polypeptide; or inhibits a DNA binding 
activity of a subject polypeptide. 

[0521] Peptides can include naturally-occurring and non-naturally occurring
amino acids. Peptides can comprise D-amino acids, a combination of D- and L-amino
acids, and various "designer" amino acids (e.g., β-methyl amino acids, Cα-methyl
amino acids, and Nα-methyl amino acids, etc.) to convey special properties.
Additionally, peptides can be cyclic. Peptides can include non-classical amino acids
in order to introduce particular conformational motifs. Any known non-classical
amino acid can be used. Non-classical amino acids include, but are not limited to,
1,2,3,4-tetrahydroisoquinoline-3-carboxylate; (2S,3S)-methylphenylalanine, (2S,3R)-
methyl-phenylalanine, (2R,3S)-methyl-phenylalanine and (2R,3R)-methyl-
phenylalanine; 2-aminotetrahydronaphthalene-2-carboxylic acid; hydroxy-1,2,3,4-
tetrahydroisoquinoline-3-carboxylate; β-carboline (D and L); HIC (histidine
isoquinoline carboxylic acid); and HIC (histidine cyclic urea). Amino acid analogs
and peptidomimetics can be incorporated into a peptide to induce or favor specific
secondary structures, including, but not limited to, LL-Acp (LL-3-amino-2-
propenidone-6-carboxylic acid), a β-turn inducing dipeptide analog; β-sheet inducing
analogs; β-turn inducing analogs; α-helix inducing analogs; γ-turn inducing analogs;
Gly-Ala turn analogs; amide bond isostere; or tetrazole, and the like.

[0522] A peptide can be a depsipeptide, which can be linear or cyclic (Kuisle
et al., 1999). Linear depsipeptides can comprise rings formed through S-S bridges, or
through a hydroxy or a mercapto group of a hydroxy- or mercapto-amino acid and
the carboxyl group of another amino- or hydroxy-acid, but do not comprise rings
formed only through peptide or ester links derived from hydroxy carboxylic acids. 
Cyclic depsipeptides contain at least one ring formed only through peptide or ester 
links, derived from hydroxy carboxylic acids.

[0523] Peptides can be cyclic or bicyclic. For example, the C-terminal
carboxyl group or a C-terminal ester can be induced to cyclize by internal 
displacement of the -OH or the ester (-OR) of the carboxyl group or ester respectively 
with the N-terminal amino group to form a cyclic peptide. For example, after 
synthesis and cleavage to give the peptide acid, the free acid is converted to an 
activated ester by an appropriate carboxyl group activator such as 
dicyclohexylcarbodiimide (DCC) in solution, for example, in methylene chloride
(CH2Cl2)/dimethylformamide (DMF) mixtures. The cyclic peptide is then formed by
internal displacement of the activated ester with the N-terminal amine. Internal 
cyclization as opposed to polymerization can be enhanced by use of very dilute 
solutions. Methods for making cyclic peptides are well known in the art. 

[0524] The term "bicyclic" refers to a peptide with two ring closures formed 
by covalent linkages between amino acids. A covalent linkage between two
nonadjacent amino acids constitutes a ring closure, as does a second covalent linkage 
between a pair of adjacent amino acids which are already linked by a covalent peptide 
linkage. The covalent linkages forming the ring closures can be amide linkages, 
i.e., the linkage formed between a free amino on one amino acid and a free carboxyl 
of a second amino acid, or linkages formed between the side chains or "R" groups of 
amino acids in the peptides. Thus, bicyclic peptides can be "true" bicyclic peptides, 
i.e., peptides cyclized by the formation of a peptide bond between the N-terminus and 
the C-terminus of the peptide, or they can be "depsi-bicyclic" peptides, i.e., peptides 
in which the terminal amino acids are covalently linked through their side chain 
moieties. 

[0525] A desamino or descarboxy residue can be incorporated at the terminal 
ends of the peptide, so that there is no terminal amino or carboxyl group, to decrease 
susceptibility to proteases or to restrict conformation. C-terminal functional groups 
include amide, amide lower alkyl, amide di(lower alkyl), lower alkoxy, hydroxy, and
carboxy, and the lower ester derivatives thereof, and the pharmaceutically acceptable 
salts thereof. 

[0526] In addition to the foregoing N-terminal and C-terminal modifications, 
a peptide or peptidomimetic can be modified with or covalently coupled to one or 
more of a variety of hydrophilic polymers to increase solubility and circulation half- 
life of the peptide. Suitable nonproteinaceous hydrophilic polymers for coupling to a 
peptide include, but are not limited to, polyalkylethers as exemplified by polyethylene
glycol and polypropylene glycol, polylactic acid, polyglycolic acid, polyoxyalkenes,
polyvinylalcohol, polyvinylpyrrolidone, cellulose and cellulose derivatives, dextran, 
and dextran derivatives. Generally, such hydrophilic polymers have an average 
molecular weight ranging from about 500 to about 100,000 daltons, from about 2,000 
to about 40,000 daltons, or from about 5,000 to about 20,000 daltons. The peptide 
can be derivatized with or coupled to such polymers using any of the methods set 
forth in Zalipsky, 1995; Monfardini et al., 1995; U.S. Pat. Nos. 4,640,835; 4,496,689;
4,301,144; 4,670,417; 4,791,192; 4,179,337, or WO 95/34326.

d) Antibodies

[0527] The invention provides antibodies that specifically recognize a 
particular polypeptide. Antibodies are obtained by immunizing a host animal with 
peptides, polynucleotides encoding polypeptides, or cells, each comprising all or a 
portion of the target protein ("immunogen"). Suitable host animals include rodents 
(e.g., mouse, rat, guinea pig, hamster), livestock (e.g., sheep, pig, cow, horse, goat), cat,
dog, chicken, primate, monkey, and rabbit. The origin of the protein immunogen can 
be any species, including mouse, human, rat, monkey, avian, insect, reptile, or 
crustacean. The host animal will generally be a different species than the 
immunogen, e.g., a human protein used to immunize mice. Methods of antibody 
production are well known in the art (Howard and Bethell, 2000; Harlow et al., 1998;
Harlow and Lane, 1988). 

[0528] The immunogen can comprise the complete protein, or fragments and 
derivatives thereof, or proteins expressed on cell surfaces. Immunogens comprise all 
or a part of one of the subject proteins, where these amino acids contain post- 
translational modifications, such as glycosylation, found on the native target protein. 
Immunogens comprising protein extracellular domains are produced in a variety of 
ways known in the art, e.g., expression of cloned genes using conventional 
recombinant methods, or isolation from tumor cell culture supernatants, etc. The 
immunogen can also be expressed in vivo from a polynucleotide encoding the 
immunogenic peptide introduced into the host animal. 

[0529] Polyclonal antibodies are prepared by conventional techniques. 
These include immunizing the host animal in vivo with the target protein (or 
immunogen) in substantially pure form, for example, comprising less than about 1% 
contaminant. The immunogen can comprise the complete target protein, fragments, 
or derivatives thereof. To increase the immune response of the host animal, the target
protein can be combined with an adjuvant; suitable adjuvants include alum, dextran
sulfate, large polymeric anions, and oil and water emulsions, e.g., Freund's adjuvant
(complete or incomplete). The target protein can also be conjugated to synthetic 
carrier proteins or synthetic antigens. The target protein is administered to the host, 
usually intradermally, with an initial dosage followed by one or more, usually at least 
two, additional booster dosages. Following immunization, blood from the host will
be collected, followed by separation of the serum from blood cells. The 
immunoglobulin present in the resultant antiserum can be further fractionated using 
known methods, such as ammonium salt fractionation, or DEAE chromatography and 
the like. 

[0530] The method of producing polyclonal antibodies can be varied in some 
embodiments of the present invention. For example, instead of using a single 
substantially isolated polypeptide as an immunogen, one may inject a number of 
different immunogens into one animal for simultaneous production of a variety of 
antibodies. In addition to protein immunogens, the immunogens can be nucleic acids 
(e.g., in the form of plasmids or vectors) that encode the proteins, with facilitating 
agents, such as liposomes, microspheres, etc., or without such agents, such as "naked"
DNA. 

[0531] Antibodies can also be prepared using a library approach. Briefly,
mRNA is extracted from the spleens of immunized animals to isolate antibody- 
encoding sequences. The extracted mRNA may be used to make cDNA libraries. 
Such a cDNA library may be normalized and subtracted in a manner conventional in 
the art, for example, to subtract out cDNA hybridizing to mRNA of non-immunized 
animals. The remaining cDNA may be used to create proteins and for selection of 
antibody molecules or fragments that specifically bind to the immunogen. The cDNA
clones of interest, or fragments thereof, can be introduced into an in vitro expression 
system to produce the desired antibodies, as described herein. 

[0532] In a further embodiment, polyclonal antibodies can be prepared using 
phage display libraries, conventional in the art. In this method, a collection of 
bacteriophages displaying antibody properties on their surface are made to contact
subject polypeptides, or fragments thereof. Bacteriophages displaying antibody 
properties that specifically recognize the subject polypeptides are selected, amplified, 
for example, in E. coli, and harvested. Such a method typically produces single chain 
antibodies.

[0533] Monoclonal antibodies are also produced by conventional techniques,
such as fusing an antibody-producing plasma cell with an immortal cell to produce 
hybridomas. Suitable animals will be used; for example, to raise antibodies against a mouse
polypeptide of the invention, the host animal will generally be a hamster, guinea pig, 
goat, chicken, or rabbit, and the like. Generally, the spleen and/or lymph nodes of an 
immunized host animal provide the source of plasma cells, which are immortalized by 
fusion with myeloma cells to produce hybridoma cells. Culture supernatants from 
individual hybridomas are screened using standard techniques to identify clones 
producing antibodies with the desired specificity. The antibody can be purified from 
the hybridoma cell supernatants or from ascites fluid present in the host by 
conventional techniques, e.g., affinity chromatography using antigen, e.g., the subject 
protein, bound to an insoluble support, e.g., protein A sepharose, etc.

[0534] The antibody can be produced as a single chain, instead of the normal 
multimeric structure of the immunoglobulin molecule. Single chain antibodies have 
been previously described (e.g., Jost et al., 1994). DNA sequences encoding parts of
the immunoglobulin, for example, the variable region of the heavy chain and the
variable region of the light chain, are ligated to a spacer, such as one encoding at least
about four small neutral amino acids, e.g., glycine or serine. The protein encoded by
this fusion allows the assembly of a functional variable region that retains the 
specificity and affinity of the original antibody. 

[0535] The invention also provides intrabodies that are intracellularly 
expressed single-chain antibody molecules designed to specifically bind and 
inactivate target molecules inside cells. Intrabodies have been used in cell assays and 
in whole organisms (Chen et al., 1994; Hassanzadeh et al., 1998). Inducible
expression vectors can be constructed with intrabodies that react specifically with a 
protein of the invention. These vectors can be introduced into host cells and model 
organisms. 

[0536] The invention also provides "artificial" antibodies, e.g., antibodies 
and antibody fragments produced and selected in vitro. In some embodiments, these 
antibodies are displayed on the surface of a bacteriophage or other viral particle, as 
described above. In other embodiments, artificial antibodies are present as fusion 
proteins with a viral or bacteriophage structural protein, including, but not limited to, 
M13 gene III protein. Methods of producing such artificial antibodies are well known
in the art (U.S. Patent Nos. 5,516,637; 5,223,409; 5,658,727; 5,667,988; 5,498,538;
5,403,484; 5,571,698; and 5,625,033). The artificial antibodies, selected, for example,
on the basis of phage binding to selected antigens, can be fused to an Fc fragment of an
immunoglobulin for use as a therapeutic, as described, for example, in US 5,116,964
or WO 99/61630. Antibodies of the invention can be used to modulate biological 
activity of cells, either directly or indirectly. A subject antibody can modulate the 
activity of a target cell, with which it has primary interaction, or it can modulate the 
activity of other cells by exerting secondary effects, e.g., when the primary targets
interact or communicate with other cells. The antibodies of the invention can be 
administered to mammals, and the present invention includes such administration, 
particularly for therapeutic and/or diagnostic purposes in humans. 

[0537] Antibodies may be administered by injection systemically, such as by 
intravenous injection; or by injection or application to the relevant site, such as by 
direct injection into a tumor, or direct application to the site when the site is exposed 
in surgery; or by topical application, such as if the disorder is on the skin, for 
example. 

[0538] For in vivo use, particularly for injection into humans, in some 
embodiments it is desirable to decrease the antigenicity of the antibody. An immune 
response of a recipient against the antibody may potentially decrease the period of 
time that the therapy is effective. Methods of humanizing antibodies are known in the 
art. The humanized antibody can be the product of an animal having transgenic 
human immunoglobulin genes, e.g., constant region genes (e.g., Grosveld and Kollias,
1992; Murphy and Carter, 1993; Pinkert, 1994; and International Patent Applications 
WO 90/10077 and WO 90/04036). Alternatively, the antibody of interest can be 
engineered by recombinant DNA techniques to substitute the CH1, CH2, CH3, hinge
domains, and/or the framework domain with the corresponding human sequence (see, 
e.g., WO 92/02190). Both polyclonal and monoclonal antibodies made in non-human 
animals may be "humanized" before administration to human subjects. 

[0539] Chimeric immunoglobulin genes constructed with immunoglobulin 
cDNA are known in the art (Liu et al. 1987a; Liu et al. 1987b). Messenger RNA is 
isolated from a hybridoma or other cell producing the antibody and used to produce 
cDNA. The cDNA of interest can be amplified by the polymerase chain reaction 
using specific primers (U.S. Patent Nos. 4,683,195 and 4,683,202). Alternatively, a
library is made and screened to isolate the sequence of interest. The DNA sequence 
encoding the variable region of the antibody is then fused to human constant region
sequences. The sequences of human constant region genes are known in the art
(Kabat et al., 1991). Human C region genes are readily available from known clones. 
The choice of isotype will be guided by the desired effector functions, such as 
complement fixation, or antibody-dependent cellular cytotoxicity. IgG1, IgG3 and
IgG4 isotypes, and either of the kappa or lambda human light chain constant regions 
can be used. The chimeric, humanized antibody is then expressed by conventional 
methods. 

[0540] Consensus sequences of heavy ("H") and light ("L") J regions can be 
used to design oligonucleotides for use as primers to introduce useful restriction sites 
into the J region for subsequent linkage of V region segments to human C region 
segments. C region cDNA can be modified by site directed mutagenesis to place a 
restriction site at the analogous position in the human sequence. 

[0541] A convenient expression vector for producing antibodies is one that 
encodes a functionally complete human CH or CL immunoglobulin sequence, with 
appropriate restriction sites engineered so that any VH or VL sequence can be easily 
inserted and expressed, such as plasmids, retroviruses, YACs, or EBV-derived
episomes, and the like. In such vectors, splicing usually occurs between the splice
donor site in the inserted J region and the splice acceptor site preceding the human C 
region, and also at the splice regions that occur within the human CH exons. 
Polyadenylation and transcription termination occur at native chromosomal sites 
downstream of the coding regions. The resulting chimeric antibody can be joined to 
any strong promoter, including retroviral LTRs, e.g., SV-40 early promoter
(Okayama et al., 1983), Rous sarcoma virus LTR (Gorman et al., 1982), and Moloney
murine leukemia virus LTR (Grosschedl et al., 1985), or native immunoglobulin
promoters. 

[0542] In yet other embodiments, the antibodies can be fully human 
antibodies. For example, xenogenic antibodies, which are produced in animals that 
are transgenic for human antibody genes, can be employed. By xenogenic human 
antibodies is meant antibodies that are fully human antibodies, with the exception that 
they are produced in a non-human host that has been genetically engineered to 
express human antibodies (e.g., WO 98/50433; WO 98/24893 and WO 99/53049).

[0543] Antibody fragments, such as Fv, F(ab')2 and Fab can be prepared by
cleavage of the intact protein, e.g., by protease or chemical cleavage. These 
fragments can include heavy and light chain variable regions. Alternatively, a
truncated gene can be designed, e.g., a chimeric gene encoding a portion of the F(ab')2
fragment that includes DNA sequences encoding the CH1 domain and hinge region of
the H chain, followed by a translational stop codon. The antibodies of the present 
invention may be administered alone or in combination with other molecules for use 
as a therapeutic, for example, by linking the antibody to a cytotoxic agent, as discussed
above, or to a radioactive molecule. Radioactive antibodies that are specific to a 
cancer cell, disease cell, or virus-infected cell may be able to deliver a sufficient dose 
of radioactivity to kill such cancer cell, disease cell, or virus-infected cell. The 
antibodies of the present invention can also be used in assays for detection of the 
subject polypeptides. In some embodiments, the assay is a binding assay that detects 
binding of a polypeptide with an antibody specific for the polypeptide; the subject 
polypeptide or antibody can be immobilized, while the subject polypeptide and/or 
antibody can be detectably-labeled. For example, the antibody can be directly labeled 
or detected with a labeled secondary antibody. That is, suitable, detectable labels for 
antibodies include direct labels, which label the antibody to the protein of interest, and 
indirect labels, which label an antibody that recognizes the antibody to the protein of 
interest. 

[0544] These labels include radioisotopes, including, but not limited to, 64Cu,
67Cu, 90Y, 124I, 125I, 131I, 137Cs, 186Re, 211At, 212Bi, 213Bi, 223Ra, 241Am, and 244Cm;
enzymes having detectable products (e.g., luciferase, β-galactosidase, and the like);
fluorescers and fluorescent labels, e.g., as provided herein; fluorescence emitting
metals, e.g., 152Eu, or others of the lanthanide series, attached to the antibody through
metal chelating groups such as EDTA; chemiluminescent compounds, e.g., luminol, 
isoluminol, or acridinium salts; and bioluminescent compounds, e.g., luciferin, or 
aequorin (green fluorescent protein), specific binding molecules, e.g., magnetic 
particles, microspheres, nanospheres, and the like. 

[0545] Alternatively, specific-binding pairs may be used, involving, e.g., a 
second stage antibody or reagent that is detectably-labeled and that can amplify the 
signal. For example, a primary antibody can be conjugated to biotin, and horseradish 
peroxidase-conjugated streptavidin added as a second stage reagent. Digoxin and
antidigoxin provide another such pair. In other embodiments, the secondary antibody 
can be conjugated to an enzyme such as peroxidase in combination with a substrate 
that undergoes a color change in the presence of the peroxidase. The absence or 
presence of antibody binding can be determined by various methods, including flow
cytometry of dissociated cells, microscopy, radiography, or scintillation counting.
Such reagents and their methods of use are well known in the art.

e) Peptide Aptamers

[0546] Another suitable agent for modulating an activity of a subject 
polypeptide is a peptide aptamer. Peptide aptamers are peptides or small polypeptides 
that act as dominant inhibitors of protein function. Peptide aptamers specifically bind 
to target proteins, blocking their functional ability (Kolonin and Finley, 1998). Due to 
the highly selective nature of peptide aptamers, they can be used not only to target a 
specific protein, but also to target specific functions of a given protein (e.g., a 
signaling function). Further, peptide aptamers can be expressed in a controlled 
fashion by use of promoters which regulate expression in a temporal, spatial or 
inducible manner. Peptide aptamers act dominantly; therefore, they can be used to
analyze proteins for which loss-of-function mutants are not available. 

[0547] Peptide aptamers that bind with high affinity and specificity to a 
target protein can be isolated by a variety of techniques known in the art. Peptide 
aptamers can be isolated from random peptide libraries by yeast two-hybrid screens 
(Xu et al., 1997). They can also be isolated from phage libraries (Hoogenboom et al., 
1998) or chemically generated peptides/libraries.

Therapeutic Applications: Methods of Use 

[0548] The instant invention provides various therapeutic methods. In some 
embodiments, methods of modulating, including increasing and inhibiting, a 
biological activity of a subject protein are provided. In some embodiments, methods 
of modulating an enzymatic activity of a subject protein are provided. In some 
embodiments, methods of increasing the level of enzymatically active subject protein 
are provided, while in some embodiments, methods of decreasing a level of 
enzymatically active subject protein are provided. 

[0549] In some embodiments, methods of modulating enzymatic activity of a 
subject protein are provided. In other embodiments, methods of modulating a signal 
transduction activity of a subject protein are provided. In further embodiments, 
methods of modulating interaction of a subject protein with another, interacting 
protein or other macromolecule (e.g., DNA, carbohydrate, lipid) are provided. In 
further embodiments, methods of modulating transport activity of a subject protein are 
provided. In further embodiments, methods of modulating phospholipase activity of a
subject protein are provided. In further embodiments, methods of modulating
polymerase activity of a subject protein are provided. In further embodiments,
methods of modulating nuclease activity of a subject protein are provided. 

[0550] As mentioned above, an effective amount of the active agent (e.g., 
small molecule, antibody specific for a subject polypeptide, a subject polypeptide, or 
a subject polynucleotide) is administered to the host, where "effective amount" means 
a dosage sufficient to produce a desired effect or result. In some embodiments, the 
desired result is at least a reduction in a given biological activity of a subject 
polypeptide, as compared to a control, for example, a decreased level of enzymatically 
active subject protein in the individual, or in a localized anatomical site in the 
individual. In further embodiments, the desired result is at least an increase in a 
biological activity of a subject polypeptide as compared to a control, for example an 
increased level of enzymatically active subject protein in the individual, or in a 
localized anatomical site in the individual. 

[0551] Typically, the compositions of the instant invention will contain from
less than about 1% to about 95% of the active ingredient, or from about 10% to about 50%.
Generally, between about 100 mg and about 500 mg will be administered to a child 
and between about 500 mg and about 5 grams will be administered to an adult. 

[0552] Other effective dosages can be readily determined by one of ordinary
skill in the art through routine trials establishing dose response curves, for example, 
the amount of agent necessary to increase a level of active subject polypeptide can be 
calculated from in vitro experimentation. Those of skill will readily appreciate that 
dose levels can vary as a function of the specific compound, the severity of the 
symptoms, and the susceptibility of the subject to side effects, and preferred dosages 
for a given compound are readily determinable by those of skill in the art by a variety 
of means. For example, in order to calculate the polypeptide, polynucleotide, or
modulator dose, those skilled in the art can use readily available information with
respect to the amount necessary to have the desired effect, depending upon the
particular agent used.

[0553] The active agent(s) can be administered to the host via any 
convenient means capable of resulting in the desired result. Administration is 
generally by injection and often by injection to a localized area. The frequency of 
administration will be determined by the caregiver based on patient responsiveness.
For example, the agents may be administered daily, weekly, or as conventionally 
determined appropriate.

[0554] A variety of hosts are treatable according to the subject methods. The
host, or patient, may be from any animal species, and will generally be mammalian, 
e.g., primate sp., e.g., monkeys, chimpanzees, and particularly humans; rodents,
including mice, rats, hamsters, and guinea pigs; rabbits; livestock, including equines,
bovines, pigs, sheep, and goats; canines; felines; etc. Animal models are of interest for
experimental investigations, providing a model for treatment of human disease. 

Proliferative Conditions 

[0555] In some embodiments, a protein of the present invention is involved 
in the control of cell proliferation, and an agent of the invention inhibits undesirable 
cell proliferation. Such agents are useful for treating disorders that involve abnormal 
cell proliferation, including, but not limited to, cancer, psoriasis, and scleroderma. 
Whether a particular agent and/or therapeutic regimen of the invention is effective in 
reducing unwanted cellular proliferation, e.g., in the context of treating cancer, can be 
determined using standard methods. For example, the number of cancer cells in a 
biological sample (e.g., blood, a biopsy sample, and the like), can be determined. The 
tumor mass can be determined using standard radiological or biochemical methods. 

[0556] Tumors that can be treated using the methods of the instant invention 
include carcinomas, e.g., colorectal, prostate, breast, bone, kidney, skin, melanoma, 
ductal, endometrial, stomach or other organ of the gastrointestinal tract, pancreatic, 
mesothelioma, dysplastic oral mucosa, invasive oral cancer, non-small cell lung 
carcinoma ("NSCL"), transitional and squamous cell urinary carcinoma; brain cancer 
and neurological malignancies, e.g., neuroblastoma, glioblastoma, astrocytoma, and 
gliomas; lymphomas and leukemias such as myeloid leukemia, myelogenous 
leukemia, hematological malignancies, such as childhood acute leukemia, non- 
Hodgkin's lymphomas, chronic lymphocytic leukemia, malignant cutaneous T-cell 
lymphoma, mycosis fungoides, non-MF cutaneous T-cell lymphoma, lymphomatoid 
papulosis, T-cell rich cutaneous lymphoid hyperplasia, bullous pemphigoid, discoid 
lupus erythematosus, lichen planus, and human follicular lymphoma; cancers of the 
reproductive system, e.g., cervical and ovarian cancers and testicular cancers; liver 
cancers including hepatocellular carcinoma ("HCC") and tumors of the biliary duct; 
multiple myelomas; tumors of the esophageal tract; other lung cancers and tumors 
including small cell and clear cell; Hodgkin's lymphomas; adenocarcinoma; and 
sarcomas, including soft tissue sarcomas.

Immunotherapeutic Approaches to Proliferative Conditions

[0557] The polynucleotides, polypeptides, and modulators of the present
invention find use in immunotherapy of hyperproliferative disorders, including 
cancer, neoplastic, and paraneoplastic disorders. That is, the subject molecules can 
correspond to tumor antigens, of which 1770 have been identified to date (Yu and
Restifo, 2002). Immunotherapeutic approaches include passive immunotherapy and 
vaccine therapy and can accomplish both generic and antigen-specific cancer 
immunotherapy. 

[0558] Passive immunity approaches involve antibodies of the invention that 
are directed toward specific tumor-associated antigens. Such antibodies can eradicate 
systemic tumors at multiple sites, without eradicating normal cells. In some 
embodiments, the antibodies are combined with radioactive components, as provided 
above, for example, combining the antibody's ability to specifically target tumors with
the added lethality of the radioisotope to the tumor DNA.

[0559] Useful antibodies comprise a discrete epitope or a combination of 
nested epitopes, e.g., a 10-mer epitope and associated peptide multimers incorporating
all potential 8-mers and 9-mers, or overlapping epitopes (Dutoit et al., 2002). Thus a 
single antibody can interact with one or more epitopes. Further, the antibody can be 
used alone or in combination with different antibodies, that all recognize either a
single or multiple epitopes.
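
To make the nested-epitope arrangement above concrete, the following minimal sketch enumerates every 8-mer and 9-mer contained in a 10-mer. The function name and the example 10-mer are hypothetical placeholders and do not correspond to any disclosed sequence.

```python
# Minimal sketch: list all shorter peptides nested inside a parent epitope.
# The 10-mer used here is an arbitrary placeholder, not a disclosed sequence.


def nested_peptides(epitope: str, lengths=(8, 9)) -> list[str]:
    """Return every contiguous sub-peptide of the given lengths, in order."""
    peptides = []
    for k in lengths:
        # A parent of length n contains n - k + 1 windows of length k.
        for start in range(len(epitope) - k + 1):
            peptides.append(epitope[start:start + k])
    return peptides


if __name__ == "__main__":
    ten_mer = "ACDEFGHIKL"  # hypothetical 10-mer epitope
    for peptide in nested_peptides(ten_mer):
        print(len(peptide), peptide)
```

A 10-mer therefore contains three 8-mers and two 9-mers, for five nested peptides in total.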

[0560] Neutralizing antibodies can provide therapy for cancer and
proliferative disorders. Neutralizing antibodies that specifically recognize a secreted
protein or peptide of the invention can bind to the secreted protein or peptide, e.g., in 
a bodily fluid or the extracellular space, thereby modulating the biological activity of 
the secreted protein or peptide. For example, neutralizing antibodies specific for 
secreted proteins or peptides that play a role in stimulating the growth of cancer cells 
can be useful in modulating the growth of cancer cells. Similarly, neutralizing 
antibodies specific for secreted proteins or peptides that play a role in the 
differentiation of cancer cells can be useful in modulating the differentiation of cancer 
cells. 

[0561] Vaccine therapy involves the use of polynucleotides, polypeptides, or 
agents of the invention as immunogens for tumor antigens (Machiels et al., 2002). 
For example, peptide-based vaccines of the invention include unmodified subject 
polypeptides, fragments thereof, and MHC class I and class II-restricted peptide 
(Knutson et al., 2001), comprising, for example, the disclosed sequences with 
universal, nonspecific MHC class II-restricted epitopes. Peptide-based vaccines 
comprising a tumor antigen can be given directly, either alone or in conjunction with 
other molecules. The vaccines can also be delivered orally by producing the antigens 
in transgenic plants that can be subsequently ingested (U.S. Patent No. 6,395,964). 

[0562] In some embodiments, antibodies themselves can be used as antigens 
in anti-idiotype vaccines. That is, administering an antibody to a tumor antigen 
stimulates B cells to make antibodies to that antibody, which in turn recognize the 
tumor cells. 

[0563] Nucleic acid-based vaccines can deliver tumor antigens as 
polynucleotide constructs encoding the antigen. Vaccines comprising genetic 
material, such as DNA or RNA, can be given directly, either alone or in conjunction 
with other molecules. Administration of a vaccine expressing a molecule of the 
invention, e.g., as plasmid DNA, leads to persistent expression and release of the 
therapeutic immunogen over a period of time, helping to control unwanted tumor 
growth. 

[0564] In some embodiments, nucleic acid-based vaccines encode subject 
antibodies. In such embodiments, the vaccines (e.g., DNA vaccines) can include 
post-transcriptional regulatory elements, such as the post-transcriptional regulatory 
acting RNA element (WPRE) derived from Woodchuck Hepatitis Virus. These post- 
transcriptional regulatory elements can be used to target the antibody, or a fusion 
protein comprising the antibody and a co-stimulatory molecule, to the tumor 
microenvironment (Pertl et al., 2003). 

[0565] Besides stimulating anti-tumor immune responses by inducing 
humoral responses, vaccines of the invention can also induce cellular responses, 
including stimulating T-cells that recognize and kill tumor cells directly. For 
example, nucleotide-based vaccines of the invention encoding tumor antigens can be 
used to activate the CD8+ cytotoxic T lymphocyte arm of the immune system. 

[0566] In some embodiments, the vaccines activate T-cells directly, and in 
others they enlist antigen-presenting cells to activate T-cells. Killer T-cells are 
primed, in part, by interacting with antigen-presenting cells, i.e., dendritic cells. In 
some embodiments, plasmids comprising the nucleic acid molecules of the invention 
enter antigen-presenting cells, which in turn display the encoded tumor-antigens that 
contribute to killer T-cell activation. Again, the tumor antigens can be delivered as 
plasmid DNA constructs, either alone or with other molecules. 

[0567] In further embodiments, RNA can be used. For example, dendritic 
cells can be transfected with RNA encoding tumor antigens (Heiser et al., 2002; 
Mitchell and Nair, 2000). This approach overcomes the limitations of obtaining 
sufficient quantities of tumor material, extending therapy to patients otherwise 
excluded from clinical trials. For example, a subject RNA molecule isolated from 
tumors can be amplified using RT-PCR. In some embodiments, the RNA molecule of 
the invention is directly isolated from tumors and transfected into dendritic cells with 
no intervening cloning steps. 

[0568] In some embodiments the molecules of the invention are altered such 
that the peptide antigens are more highly antigenic than in their native state. These 
embodiments address the need in the art to overcome the poor in vivo immunogenicity 
of most tumor antigens by enhancing tumor antigen immunogenicity via modification 
of epitope sequences (Yu and Restifo, 2002). 

[0569] Another recognized problem of cancer vaccines is the presence of 
preexisting neutralizing antibodies. Some embodiments of the present invention 
overcome this problem by using viral vectors from non-mammalian natural hosts, i.e., 
avian pox viruses. Alternative embodiments that also circumvent preexisting 
neutralizing antibodies include genetically engineered influenza viruses, and the use 
of "naked" plasmid DNA vaccines that contain DNA with no associated protein (Yu 
and Restifo, 2002). 

[0570] All of the immunogenic methods of the invention can be used alone 
or in combination with other conventional or unconventional therapies. For example, 
immunogenic molecules can be combined with other molecules that have a variety of 
antiproliferative effects, or with additional substances that help stimulate the immune 
response, i.e., adjuvants or cytokines. 

[0571] For example, in some embodiments, nucleic acid vaccines encode an 
alphaviral replicase enzyme, in addition to tumor antigens. This recently discovered 
approach to vaccine therapy successfully combines therapeutic antigen production 
with the induction of the apoptotic death of the tumor cell (Yu and Restifo, 2002). 

[0572] In certain other embodiments, a DNA or RNA vaccine of the present 
invention can also be directed against the production of blood vessels in the vicinity 
of the tumor, a process called antiangiogenesis, thereby depriving the cancer cells of 
nutrients. For example, the antiangiogenic molecules angiostatin (a fragment of 
plasminogen), endostatin (a fragment of collagen XVIII), interferon-γ, interferon- 
γ inducible protein 10, interleukin 12, thrombospondin, platelet factor-4, calreticulin, 
or its protein fragment vasostatin can be used to treat tumors by suppressing 
neovascularization and thereby inhibiting growth (Cheng et al., 2001). The 
antiangiogenesis approach can be used alone, or in conjunction with molecules 
directed to tumor antigens. 

[0573] Furthermore, adjuvants can be used in conjunction with the 
antibodies and vaccines disclosed herein. Adjuvants help boost the general immune 
response, for example, concentrating immune cells to the specific area where they are 
needed. They can be added to a cancer vaccine itself or administered separately, and 
in some embodiments, a viral vector can be engineered to display adjuvant proteins on 
its surface. 

[0574] Cytokines can also be used to help stimulate immune response. 
Cytokines act as chemical messengers, recruiting immune cells that help the killer T- 
cells to the site of attack. An example of a cytokine is granulocyte-macrophage 
colony-stimulating factor (GM-CSF), which stimulates the proliferation of antigen- 
presenting cells, thus boosting an organism's response to a cancer vaccine. As with 
adjuvants, cytokines can be used in conjunction with the antibodies and vaccines 
disclosed herein. For example, they can be incorporated into the antigen-encoding 
plasmid or introduced via a separate plasmid, and in some embodiments, a viral 
vector can be engineered to display cytokines on its surface. 

Inflammation and Immunity 

[0575] In other embodiments, e.g., where the subject polypeptide is involved 
in modulating inflammation or immune function, the invention provides agents for 
treating such inflammation or immune disorders. Disease states that are treatable 
using formulations of the invention include various types of arthritis such as 
rheumatoid arthritis and osteoarthritis, autoimmune thyroiditis, various chronic 
inflammatory conditions of the skin, such as psoriasis, the intestine, such as 
inflammatory bowel disease (IBD), insulin-dependent diabetes, autoimmune diseases 
such as multiple sclerosis (MS), intestinal immune disorders and systemic lupus 
erythematosus (SLE), allergic diseases, transplant rejections, adult respiratory distress 
syndrome, atherosclerosis, ischemic diseases due to closure of the peripheral 
vasculature, cardiac vasculature, and vasculature in the central nervous system (CNS). 
After reading the present disclosure, those skilled in the art will recognize other 
disease states and/or symptoms which might be treated and/or mitigated by the 
administration of formulations of the present invention. 

[0576] Neutralizing antibodies can provide immunosuppressive therapy for 
inflammatory and autoimmune disorders. Neutralizing antibodies can be used to treat 
disorders such as, for example, multiple sclerosis, rheumatoid arthritis, inflammatory 
bowel disease, transplant rejection, and psoriasis. Neutralizing antibodies that 
specifically recognize a secreted protein or peptide of the invention can bind to the 
secreted protein or peptide, e.g., in a bodily fluid or the extracellular space, thereby 
modulating the biological activity of the secreted protein or peptide. For example, 
neutralizing antibodies specific for secreted proteins or peptides that play a role in 
activating immune cells are useful as immunosuppressants. 

Disorders Related to Cell Death 

[0577] Where a polypeptide of the invention is involved in modulating cell 
death, an agent of the invention is useful for treating conditions or disorders relating 
to cell death (e.g., DNA damage, cell death, apoptosis). Cell death-related indications 
that can be treated using the methods of the invention to reduce cell death in a 
eukaryotic cell include, but are not limited to, cell death associated with Alzheimer's 
disease, Parkinson's disease, rheumatoid arthritis, autoimmune thyroiditis, septic 
shock, sepsis, stroke, central nervous system inflammation, intestinal inflammation, 
osteoporosis, ischemia, reperfusion injury, cardiac muscle cell death associated with 
cardiovascular disease, polycystic kidney disease, cell death of endothelial cells in 
cardiovascular disease, degenerative liver disease, multiple sclerosis, amyotrophic 
lateral sclerosis, cerebellar degeneration, ischemic injury, cerebral infarction, 
myocardial infarction, acquired immunodeficiency syndrome (AIDS), 
myelodysplasia syndromes, aplastic anemia, male pattern baldness, and head injury 
damage. Also included are conditions in which DNA damage to a cell is induced by 
external conditions, including but not limited to irradiation, radiomimetic drugs, 
hypoxic injury, chemical injury, and damage by free radicals. Also included are any 
hypoxic or anoxic conditions, e.g., conditions relating to or resulting from ischemia, 
myocardial infarction, cerebral infarction, stroke, bypass heart surgery, organ 
transplantation, and neuronal damage, etc. 


WO 2005/005597 



PCT/US2003/027106 



[0578] DNA damage can be detected using any known method, including, 
but not limited to, a Comet assay (commercially available from Trevigen, Inc.), which 
is based on alkaline lysis of labile DNA at sites of damage; and immunological assays 
using antibodies specific for aberrant DNA structures, e.g., 8-OHdG. 

[0579] Cell death can be measured using any known method, and is 
generally measured using any of a variety of known methods for measuring cell 
viability. Such assays are generally based on entry into the cell of a detectable 
compound (or a compound that becomes detectable upon interacting with, or being 
acted on by, an intracellular component) that would normally be excluded from a 
normal, living cell by its structurally and functionally intact cell membrane. Such 
compounds include substrates for intracellular enzymes, including, but not limited to, 
a fluorescent substrate for esterase; dyes that are excluded from living cells, including, 
but not limited to, trypan blue; and DNA-binding compounds, including, but not 
limited to, an ethidium compound such as ethidium bromide and ethidium 
homodimer, and propidium iodide. 
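
As a minimal sketch of how a dye-exclusion count is commonly turned into a viability figure, the following Python function computes percent viability from the numbers of unstained (dye-excluding) and stained cells. The function name and the example counts are hypothetical, and the conventional formula shown is an assumption rather than a calculation specified in this disclosure.

```python
def percent_viability(unstained_cells, stained_cells):
    """Percent of counted cells that excluded the dye (e.g., trypan blue)."""
    total = unstained_cells + stained_cells
    if total == 0:
        raise ValueError("no cells were counted")
    return 100.0 * unstained_cells / total


# Hypothetical counts from a single field, for illustration only.
viability = percent_viability(unstained_cells=90, stained_cells=10)
print(f"{viability:.1f}% viable")
```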

[0580] Apoptosis, or programmed cell death, is a regulated process leading 
to cell death via a series of well-defined morphological changes. Programmed cell 
death provides a balance for cell growth and multiplication, eliminating unnecessary 
cells. The default state of the cell is to remain alive. A cell enters the apoptotic 
pathway when an essential factor is removed from the extracellular environment or 
when an internal signal is activated. Genes and proteins of the invention that suppress 
the growth of tumors by activating cell death provide the basis for treatment strategies 
for hyperproliferative disorders and conditions. 

[0581] Apoptosis can be assayed using any known method. Assays can be 
conducted on cell populations or an individual cell, and include morphological assays 
and biochemical assays. A non-limiting example of a method of determining the level 
of apoptosis in a cell population is TUNEL (TdT-mediated dUTP nick-end labeling) 
labeling of the 3'-OH free end of DNA fragments produced during apoptosis 
(Gavrieli et al., 1992). The TUNEL method consists of catalytically adding a 
nucleotide, which has been conjugated to a chromogen system or a fluorescent tag, to 
the 3'-OH end of the 180-bp (base pair) oligomer DNA fragments, in order to detect 
the fragments. The presence of a DNA ladder of 180-bp oligomers is indicative of 
apoptosis. Procedures to detect cell death based on the TUNEL method are available 
commercially, e.g., from Boehringer Mannheim (Cell Death Kit) and Oncor (Apoptag 
Plus). 
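
The ladder criterion mentioned above can be sketched in Python as a check that observed fragment sizes fall near integer multiples of 180 bp. The function name, the 20-bp tolerance, and the requirement of at least three distinct rungs are illustrative assumptions and are not taken from this disclosure or from any commercial kit.

```python
def is_nucleosomal_ladder(fragment_sizes_bp, unit_bp=180, tolerance_bp=20, min_rungs=3):
    """Return True if enough fragments sit near distinct multiples of unit_bp."""
    rungs = set()
    for size in fragment_sizes_bp:
        multiple = round(size / unit_bp)
        # Accept the fragment only if it lies within the tolerance of a rung.
        if multiple >= 1 and abs(size - multiple * unit_bp) <= tolerance_bp:
            rungs.add(multiple)
    return len(rungs) >= min_rungs


# Hypothetical band sizes estimated from a gel, for illustration only.
print(is_nucleosomal_ladder([182, 355, 540, 725]))
print(is_nucleosomal_ladder([1000, 2300]))
```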

[0582] Another marker that is currently available is annexin, sold under the 
trademark APOPTEST™. This marker is used in the "Apoptosis Detection Kit," 
which is also commercially available, e.g., from R&D Systems. During apoptosis, a 
cell membrane's phospholipid asymmetry changes such that the phospholipids are 
exposed on the outer membrane. Annexins are a homologous group of proteins that 
bind phospholipids in the presence of calcium. A second reagent, propidium iodide 
(PI), is a DNA binding fluorochrome. When a cell population is exposed to both 
reagents, apoptotic cells stain positive for annexin and negative for PI, necrotic cells 
stain positive for both, and live cells stain negative for both. Other methods of testing for 
apoptosis are known in the art and can be used, including, e.g., the method disclosed 
in U.S. Patent No. 6,048,703. 
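
The annexin/PI staining pattern described above can be written as a small decision rule. The following Python sketch is illustrative only: the function name and category labels are hypothetical, and the annexin-negative, PI-positive combination, which the text does not address, is reported as unclassified.

```python
def classify_cell(annexin_positive, pi_positive):
    """Classify a cell from its annexin and propidium iodide (PI) staining."""
    if annexin_positive and pi_positive:
        return "necrotic"
    if annexin_positive:
        return "apoptotic"
    if not pi_positive:
        return "live"
    # Annexin-negative, PI-positive cells are not covered by the description.
    return "unclassified"


for annexin, pi in [(True, False), (True, True), (False, False)]:
    print(annexin, pi, classify_cell(annexin, pi))
```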

Other Pathological Conditions 

[0583] Other pathological conditions that can be treated using the methods 
of the instant invention include disorders of hematopoiesis, cell differentiation, 
disorders of ion channels, e.g., cystic fibrosis, and tissue or organ hypertrophy, 
bacterial disorders, viral disorders, including acquired immunodeficiency syndrome 
(AIDS), angiogenesis, metastasis, metabolic disorders such as diabetes and obesity, 
cardiovascular disorders such as congestive heart failure and stroke, male erectile 
dysfunction, and the disorders described throughout the specification. 

Investigative Applications 

[0584] The subject nucleic acid compositions find use in a variety of 
different investigative applications. Applications of interest include identifying 
genomic DNA sequence using molecules of the invention, identifying homologs of 
molecules of the invention, creating a source of novel promoter elements, identifying 
expression regulatory factors, creating a source of probes and primers for 
hybridization applications, identifying expression patterns in biological specimens, 
preparing cell or animal models to investigate the function of the molecules of the 
invention, and preparing in vitro models to investigate the function of the molecules 
of the invention. 

Genomic DNA Sequences 

[0585] Human genomic polynucleotide sequences corresponding to 
molecules of the present invention are identified by conventional means, such as, for 
example, by probing a genomic DNA library with all or a portion of the 
polynucleotide sequences. 

Homologs 

[0586] Homologs are identified by any of a number of methods. By using 
probes, particularly labeled probes of DNA sequences, one can isolate homologous or 
related genes, as described in detail above. Briefly, a fragment of the provided cDNA 
can be used as a hybridization probe against a cDNA library from the target organism 
of interest, under various stringency conditions, e.g., low stringency conditions. The 
probe can be a large fragment, or one or more short degenerate primers, and is 
typically labeled. Sequence identity can be determined by hybridization under 
stringent conditions, as described in detail above. Nucleic acids having a region of 
substantial identity or sequence similarity to the provided nucleic acid sequences, for 
example allelic variants, related genes, or genetically altered versions of the gene, 
bind to the provided sequences under less stringent hybridization conditions. 

Promoter Elements and Expression Regulatory Factors 

[0587] The sequence of the 5' flanking region can be utilized as promoter 
elements, including enhancer binding sites that provide for tissue-specific expression 
and developmental regulation in tissues where the subject genes are expressed, 
providing promoters that mimic the native pattern of expression. Naturally occurring 
polymorphisms in the promoter region are useful for determining natural variations in 
expression, particularly those that may be associated with disease. Promoters or 
enhancers that regulate the transcription of the polynucleotides of the present 
invention are obtainable by use of PCR techniques using human tissues, and one or 
more of the present primers. 

[0588] Alternatively, mutations can be introduced into the promoter region 
to determine the effect of altering expression in experimentally defined systems. 
Methods for the identification of specific DNA motifs involved in the binding of 
transcriptional factors are known in the art, for example sequence similarity to known 
binding motifs, and gel retardation studies (Blackwell et al., 1995; Mortlock et al., 
1996; Joulin and Richard-Foy, 1995). 

[0589] The regulatory sequences can be used to identify cis-acting sequences 
required for transcriptional or translational regulation of expression, especially in 
different tissues or stages of development, and to identify cis-acting sequences and 
trans-acting factors that regulate or mediate expression. Such transcriptional or 
translational control regions can be operably linked to a gene in order to promote 
expression of wild type genes or of proteins of interest in cultured cells, embryonic, 
fetal or adult tissues, and for gene therapy (Hooper, 1993). 

Primers and Probes 

[0590] Small DNA fragments are useful as primers for reactions that involve 
nucleic acid hybridization, as described in detail above. Briefly, pairs of primers will 
be used in amplification reactions, such as PCR. Amplification primers hybridize to 
complementary strands of DNA, for example, under stringent conditions, and will 
prime towards each other. In some embodiments a pair of primers will generate an 
amplification product of at least about 50 nt, or at least about 100 nt. Algorithms for 
the selection of primer sequences are generally known, and are available in 
commercial software packages. 
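
The requirement that a primer pair hybridize to complementary strands, prime towards each other, and yield a product of at least about 50 nt can be sketched as follows. This Python example is a simplified illustration: it assumes exact-match annealing on a single-stranded representation of the template, whereas real hybridization tolerates mismatches, and the sequences and function names are hypothetical rather than disclosed sequences or an existing software package.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Length in nt of the product, or None if the pair cannot amplify."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its binding site appears as its reverse complement, downstream
    # of the forward primer.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(site) - start


# Hypothetical sequences, for illustration only.
forward = "ATGCGTACCTGA"
reverse = reverse_complement("TTGACCGTAGGC")
template = "GG" + forward + "A" * 40 + "TTGACCGTAGGC" + "CC"

length = amplicon_length(template, forward, reverse)
print(length, length is not None and length >= 50)
```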

[0591] The nucleotides can also be used as probes to identify genomic DNA 
or gene expression in a biological specimen, as described above and as is well 
established in the art. Briefly, DNA or mRNA is isolated from a cell sample. 
Detection of mRNA hybridizing to the subject sequence is indicative of gene 
expression in the sample. The mRNA can be amplified by RT-PCR, using reverse 
transcriptase to form a complementary DNA strand, followed by polymerase chain 
reaction amplification using primers specific for the subject DNA sequences. 
Alternatively, the mRNA sample is separated by gel electrophoresis, transferred to a 
suitable support, e.g., nitrocellulose, nylon, etc., and then probed with a fragment of 
the subject nucleotides as a probe. Other techniques, such as oligonucleotide ligation 
assays, in situ hybridizations, and hybridization to probes arrayed on a solid chip may 
also find use. 

Targeted Mutations for In Vivo and In Vitro Models 

[0592] The sequence of a gene according to the subject invention, including 
flanking promoter regions and coding regions, can be mutated in various ways known 
in the art to generate targeted changes, i.e., changes in promoter strength, or sequence 
of the encoded protein, etc. The DNA sequence or protein product of such a mutation 
will usually be substantially similar to the sequences provided herein. The sequence 
changes can be substitutions, insertions, deletions, or a combination thereof. 
Deletions can further include larger changes, such as deletions of a domain or exon. 

[0593] Techniques for in vitro mutagenesis of cloned genes are known. 
Examples of protocols for site specific mutagenesis may be found in Gustin et al., 
1993; Barany 1985; Colicelli et al., 1985; Prentki et al., 1984. Methods for site 
specific mutagenesis can be found in Sambrook et al., 1989 (pp. 15.3-15.108); Weiner 
et al., 1993; Sayers et al., 1992; Jones and Winistorfer; Barton et al., 1990; Marotti and 
Tomich 1989; and Zhu, 1989. Such mutated genes can be used to study structure- 
function relationships of the subject proteins, or to alter properties of the protein that 
affect its function or regulation. Other modifications of interest include epitope 
tagging, e.g., with hemagglutinin (HA), FLAG, or c-myc. For studies of subcellular 
localization, fluorescent fusion proteins can be used. 

[0594] The subject nucleic acids can be used to generate transgenic, non- 
human animals and/or site-specific gene modifications in cell lines; suitable methods 
are known in the art (Grosveld and Kollias, 1992; Hooper, 1993; Murphy and Carter, 
1993; Pinkert, 1994). Thus, in some embodiments, the invention provides a non- 
human transgenic animal comprising, as a transgene integrated into the genome of the 
animal, a nucleic acid molecule comprising a sequence encoding a subject 
polypeptide in operable linkage with a promoter, such that the subject polypeptide- 
encoding nucleic acid molecule is expressed in a cell of the animal. Either a complete 
or partial sequence of a gene native to the host can be introduced. Alternatively, a 
complete or partial sequence of a gene exogenous to the host animal, e.g., a human 
sequence of the subject invention, can be introduced. Transgenic animals can be 
made through homologous recombination, where the endogenous locus is altered. 
Thus, DNA constructs for homologous recombination will comprise at least a portion 
of the human gene or of a gene native to the species of the host animal, wherein the 
gene has the desired genetic modification(s), and includes regions of homology to the 
target locus. Methods for generating mammalian cells having targeted gene 
modifications through homologous recombination are known in the art (Keown et al., 
1990). 

[0595] Alternatively, a nucleic acid construct is randomly integrated into the 
genome. Vectors for stable integration include plasmids, retroviruses and other 
animal viruses, and YACs. DNA constructs for random integration need not include 
regions of homology to mediate recombination. 

[0596] Conveniently, markers for positive and negative selection are 
included. A detectable marker, such as lacZ, can be introduced into a locus at which 
up-regulation of expression will result in a detectable change in phenotype. 

[0597] Transformed ES or embryonic cells can be used to produce 
transgenic animals. An embryonic stem (ES) cell line can be a source of embryonic 
stem cells, or they can be newly obtained from a host animal, e.g., a mouse, rat, or 
guinea pig. The cells are grown on an appropriate fibroblast-feeder layer or in the 
presence of leukemia inhibitory factor (LIF). Following transformation, the cells are 
plated for growth onto a feeder layer in an appropriate medium. Cells containing the 
relevant construct can be detected by employing a selective medium and analyzing 
them for the occurrence of homologous recombination or integration of the construct. 
Positive colonies can be used for embryo manipulation and blastocyst injection. 
Blastocysts are obtained from 4 to 6 week old super-ovulated females. The ES cells 
are trypsinized, and the modified cells are injected into the blastocoel of the 
blastocyst. After injection, the blastocysts are returned to each uterine horn of 
pseudopregnant female animals that proceed to term. The resulting offspring are 
screened for the construct. By providing for a different phenotype of the blastocyst 
and the genetically modified cells, chimeric progeny can be readily detected. 

[0598] The chimeric animals are screened for the presence of the modified 
gene and males and females having the modification are mated to produce 
homozygous progeny. If the gene alterations cause lethality at some point in 
development, tissues or organs can be maintained as allogeneic or congenic grafts or 
transplants, or in in vitro culture. The transgenic animals can be any non-human 
mammal. 

[0599] The modified cells or animals are useful in the study of gene 
function and regulation. For example, a series of small deletions and/or substitutions 
can be made in the host's native gene to determine the role of different exons in 
biological processes such as oncogenesis or signal transduction. Of interest is the use 
of genes to construct transgenic animal models for cancer, where expression of the 
subject protein is specifically reduced or absent. Specific constructs of interest 
include anti-sense constructs, which will block expression, expression of dominant 
negative mutations, and gene over-expression. 

[0600] One can also provide for expression of the gene, e.g., a subject gene, 
or variants thereof, in cells or tissues where it is not normally expressed, at levels not 
normally present in such cells or tissues, or at abnormal times of development. One 
can also generate host cells (including host cells in transgenic animals) that comprise 
a heterologous nucleic acid molecule which encodes a polypeptide which functions to 
modulate expression of an endogenous promoter or other transcriptional regulatory 
region, or the biological activity of a subject polypeptide. 

[0601] The transgenic animals can also be used in functional studies, for 
example drug screening, to determine the effect of a candidate drug on a biological 
activity of a subject polypeptide. 

Table 1. Characteristics of the Fantom Mouse Protein With the Highest Degree 
of Similarity to the Claimed Sequences 
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product [Mus musculus] 


HG1 00036 1N0 10000 gene_prediction 
1 


gi|26330472|dbj|BAC28966.1| unnamed protein 
product [Mus musculus] 


HG 1 00 1 3 8 1 N0_1 000_gene_prediction 1 


gi|26343077|dbj|BAC35195.1| unnamed protein 
product [Mus musculus] 


HG 1 000263M0_5000_gene_predictionl 


gi|26360198|dbj|BAB25612.2| unnamed protein 
product [Mus musculus] 


HG 1 001 052N0_0_genejpredictionl 


gi|20072693|gb|AAH27297.1| Similar to cyclin 
K [Mus musculus] 


HG1000498NO 160000 gene_predictio 
nl 


gi|26352844|dbj|BAC40052.1| unnamed protein 
product [Mus musculus] 


HG1000579NO 160000 gene_predictio 
nl 


gi|26330550|dbj|BAC29005.1| unnamed protein 
product [Mus musculus] 


HG1000685NO 160000_gene_predictio 
nl 


gi|6753236|ref]NP_033915.1| calcium channel, 
voltage dependent, alpha2/delta subunit 3; 
alpha 2 delta-3 [Mus musculus] 


HG1000191N0 160000 gene_predictio 
nl 


gi|13385832|reflNP_080608.1| RIKEN cDNA 
1810055D05 [Mus musculus] 


HG 1 000296N0_1 60000_gene_predictio 
n2 


gi|25054735|reflXP_192839.1| ATPas, class H, 
type 9B [Mus musculus] 


HG 1 0O0346N0_l 000_gene_predictionl 


gi|26330504|dbj|BAC2S9S2.1| unnamed protein 
product [Mus musculus] 


HG 1 000963N0_5000_genejpredictionl 


gi|12963665|reflNP 075892.11 mesoderm 
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d 
2 


evelopment candiate 2; RIKEN cDNA 

2 1 00 1 50 1 1 gene [Mus musculus] 1 


HG100061ON0 160000 gene_predictio g 
nl F 


;i|26335037|dbj|BAC31219.1| unnamed protein 
>roduct [Mus musculus] 1 


HG1000342N0 160000_gene_predictio 
nl 


ri|20881983|ref|XP_122793.1| similar to heat- 
table antigen-related hypothetical protein 
iSA-C - mouse [Mus musculus] 


HG 1 000342N0_1 60000_gene_predictio 
n2 


ri|20881983|reflXP_122793.1| similar to heat- 
;table antigen-related hypothetical protein 
HSA-C - mouse [Mus musculus] } 


HG1000650N0_20000_gene_prediction 
1 


? i|20270210|reflNP_083847.1| RIKEN cDNA I 
11 10001 A12 [Mus musculus] j 


HG 1 000 1 9 1N0_1 60000_gene_predictio 
n2 


gi|13385832|reflNP_08O608.1| RIKEN cDNA 
1 8 1 0055D05 [Mus musculus] 1 


HG 1 000449N0_1 60000_genejpredictio 
n3 


gi|6755773|reflNP_035705.1| trefoil factor 3, 
intestinal [Mus musculus] 


HG10001SlN0_20000igene prediction 
1 


gi|26334755|dbj|BAC31078.1| unnamed protein 
product [Mus musculus] 1 


HG 1 001 05 8N0_1 60000_gene_predictio 
nl 


gi|20344262|ref]XP_l 1 0959. 1 1 similar to j 
LD31582p [Drosophila melanogaster] [Mus 

musculus] 


HG1 000 1 87N0_1 6000d_gene_predictio 
n2 


gi|26346705|dbj|BAC37001.1| unnamed protein 
product [Mus musculus] 


HG1000191N0 1000 gene_pfedictionl 


gi|13385832|reflNP_080608.1| RIKEN cDNA 
1810055D05 [Mus musculus] 1 


HG 1 0003 1 9N0_1 60000_gene_predictio 
nl 


gi|25021456|reflXP_207950.1 1 similar to 1 
pORF2 [Mus musculus domesticus] j 


HG1000137N0 0 gene_predictionl 


gi|20843789|reflXP_133814.1| similar to | 
hypothetical protein IMAGE3455200 [Homo 
sapiens] [Mus musculus] 


HG1000191N0 5000 gene_predictionl 


gi|12842346|dbj|BAB25565.1| unnamed proteinl 
product [Mus musculus] 


HG 1 000622NO_1 60000_gene_predictio 
nl 


gi|25022040|reflXP_204233.1| sunilar to ORF2 
[Mus musculus domesticus] 


HG1000390NO 1000 gene_predictionl 


gi|20892585|reflXP_l 47977. 1| RIKEN cDNA 
2610001E17 [Mus musculus] 


HG1001350N0 5000 genejpredictionl 


gi|13386102|reflNP_080892.1| RIKEN cDNA 
1500026D16 [Mus musculus] 


HG1000327NO_1 60000_gene_predictic 
n2 


> gi|26324414|dbj|BAC25961.1| unnamed protein 
product [Mus musculus] 


HG 1 000 1 79N0_1 60000_gene_predictic 
nl 


mi9fiR69l?llref1XP 146270.11 similar to 
> putative alpha 1,3-fucosyl transferase [Mus 
musculus] 


HG1000S06N0 20000 gene prediction 


i e i|23592855|reflXP 129487.21 hypothetical 
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1 


protein MGC40674 [Mus musculus] 


HG1000991N0 160000 gene_predictio 
nl 


gi|6755338|ref|NP_036013.1| ring finger 
protein 13 [Mus musculus] 


HG1001489N0 20000 gene_prediction 
1 


gi|23592855iref]XP_129487.2| hypothetical 
protein MGC40674 [Mus musculus] 


HG 1 00 1 03 8N0_5000_gene_predictionl 


gi|20S92051|reflXP_148657.1| similar to 
Lethal(2)neighbour of tid protein 2 (NOT53) 
[Mus musculus] 


HG1001376N0 160000 gene_predictio 
n2 


gi|27261816|ref|NP_080861.1| PJKEN cDNA 
C530005J20 [Mus musculus] 


HG1001376N0 20000 gene_prediction 
2 


gi|27261816MNP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 


HG1001478N0 10000 gene_prediction 
1 


gi|6979907|gb| AAF34647. 1 |AF22 1 1 03_1 
kinesin-related protein KIFC5B [Mus 
musculus] 


HG1000806N0 160000 gene_predictio 
nl 


gi|23592855|ref]XP_129487.2| hypothetical 
protein MGC40674 [Mus musculus] 


HG1000409N0 160000 gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000884NO 160000 genejpredictio 
nl 


gi|26329055|dbj|BAC28266.1| unnamed protein 
product [Mus musculus] 


HG1000575NO 160000 gene_predictio 
nl 


gi|20889984|ref]XP_l 29281.1| RIKEN cDNA 
4930538D17 [Mus musculus] 


HG1000403NO 160000 genejpredictio 
nl 


gi|26340168|dbj|BAC33747.1| unnamed protein 
product [Mus musculus] 


HG1000906NO 10000 genejprediction 
1 


gi|20836822|ref|XP_130277.1| similar to 
Plakophilin 4 (p0071) [Mus musculus] 


HG 100 1201 NO 160000 gene_predictio 
nl 


gi|26341746|dbj|BAC34535.1| unnamed protein 
product [Mus musculus] 


HG1000485N0 160000 gene_predictio 
nl 


gi|23597904|ref|XP_129263.2| protein 
phosphatase 1, regulatory (inhibitor) subunit 3C 
[Mus musculus] 


HG1000328N0 160000 gene_predictio 
nl 


gi|26336731|dbj|BAC32048.1| unnamed protein 
product [Mus musculus] 


HG1000231N0_160000_gene_prediction1 


gi|26341312|dbj|BAC34318.1| unnamed protein 
product [Mus musculus] 


HG1001257NO 10000 gene_prediction 
1 


gi|26346593|dbj|BAC36945.1| unnamed protein 
product [Mus musculus] 


HG1000026N0_5000_gene_prediction1 


gi|9506367|ref|NP_062425.1| ATP-binding 
cassette, sub-family B, member 10; ATP- 
binding cassette, sub-family B (MDR/TAP), 
member 12; Abc-mitochondrial erythroid [Mus 
musculus] 


HG1000300N0_160000_gene_prediction1 


gi|12846244|dbj|BAB27089.1| unnamed protein 
product [Mus musculus] 


HG1000109N0_160000_gene_prediction1 


gi|22779909|ref|NP_690028.1| RIKEN cDNA 
2700083B01 [Mus musculus] 


HG1000617N0_20000_gene_prediction1 


gi|7949115|ref|NP_058079.1| Ser/Arg-related 
nuclear matrix protein; plenty-of-prolines-101; 
serine/arginine repetitive matrix protein 1 [Mus 
musculus] 


HG1001110N0_160000_gene_prediction1 


gi|22779909|ref|NP_690028.1| RIKEN cDNA 
2700083B01 [Mus musculus] 


HG1001334N0_160000_gene_prediction1 


gi|26332062|dbj|BAC29761.1| unnamed protein 
product [Mus musculus] 


HG1001376N0_160000_gene_prediction3 


gi|27261816|ref|NP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 


HG1000026N0_20000_gene_prediction1 


gi|9506367|ref|NP_062425.1| ATP-binding 
cassette, sub-family B, member 10; ATP- 
binding cassette, sub-family B (MDR/TAP), 
member 12; Abc-mitochondrial erythroid [Mus 
musculus] 


HG1000276N0_1000_gene_prediction1 


gi|19527228|ref|NP_598768.1| DNA segment, 
Chr 10, ERATO Doi 214, expressed [Mus 
musculus] 


HG1000822N0_160000_gene_prediction2 


gi|6680195|ref|NP_032255.1| histone 
deacetylase 2; DNA segment, Chr 10, Wayne 
State University 179, expressed [Mus 
musculus] 


HG1000173N0_20000_gene_prediction1 


gi|26345110|dbj|BAC36204.1| unnamed protein 
product [Mus musculus] 


HG1000834N0_160000_gene_prediction1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1001044N0 1000 gene_predictionl 


gi|26330836|dbj|BAC29148.1| unnamed protein 
product [Mus musculus] 


HG1000299N0_1000_gene_prediction1 


gi|6753882|ref|NP_034349.1| FK506 binding 
protein 4 (59 kDa) [Mus musculus] 


HG1000752N0_10000_gene_prediction1 


gi|25955698|gb|AAH40387.1| Similar to 
PTPL1-associated RhoGAP 1 [Mus musculus] 


HG1000839N0_160000_gene_prediction2 


gi|17512422|gb|AAH19171.1| Similar to 
RIKEN cDNA 2310010G13 gene [Mus 
musculus] 


HG1000659N0_160000_gene_prediction1 


gi|26333733|dbj|BAC30584.1| unnamed protein 
product [Mus musculus] 


HG1000659N0_160000_gene_prediction2 


gi|26333733|dbj|BAC30584.1| unnamed protein 
product [Mus musculus] 


HG1000013N0_160000_gene_prediction1 


gi|20881136|ref|XP_126284.1| similar to sperm 
antigen HCMOGT-1 [Homo sapiens] [Mus 
musculus] 


HG1000173N0_160000_gene_prediction1 


gi|26345110|dbj|BAC36204.1| unnamed protein 
product [Mus musculus] 


HG1000330N0 160000 gene_predictio 
nl 


gi|27462832|gb|AAO15605.1|AF462146_1 
modulator of estrogen induced transcription 
[Mus musculus] 


HG1000360N0 20000 gene_prediction 
1 


gi|7861746|gb|AAF70384.1|AF189263_1 
GABA-A receptor epsilon-like subunit [Mus 
musculus] 


HG1000178N0 10000 gene_prediction 
1 


gi|13384830|ref|NP_079706.1| RIKEN cDNA 
1110066C01 [Mus musculus] 


HG1000178N0 10000 genejprediction 
2 


gi|13384830|ref|NP_079706.1| RIKEN cDNA 
1110066C01 [Mus musculus] 


HG 1 000360N0_20000_gene_prediction 
2 


gi|7861746|gb|AAF70384.1|AF189263_1 
GABA-A receptor epsilon-like subunit [Mus 
musculus] 


HG1000640N0 160000 gene_predictio 
nl 


gi|21313034|ref|NP_080346.1| RIKEN cDNA 
2900091E11 [Mus musculus] 


HG1001000N0 160000 gene_predictio 
nl 


gi|10181212|ref|NP_065613.1| RIKEN cDNA 
1300007B12; clone MNCb-2755 [Mus 
musculus] 


HG1001418N0 160000 gene_predictio 
nl 


gi|20819462|ref|XP_158058.1| hypothetical 
protein XP_158058 [Mus musculus] 


HG1000153N0 20000 gene_prediction 
1 


gi|26379523|dbj|BAB29070.2| unnamed protein 
product [Mus musculus] 


HG1000255N0_160000_gene_prediction1 


gi|13385532|ref|NP_080303.1| RIKEN cDNA 
2700086I23 [Mus musculus] 


HG1000186N0 160000 genejpredictio 
nl 


gi|20963196|ref|XP_135684.1| RIKEN cDNA 
1700022L20 [Mus musculus] 


HG1000259N0 160000 gene_predictio 
nl 


gi|26360198|dbj|BAB25612.2| unnamed protein 
product [Mus musculus] 


HG1000559N0 10000 gene_prediction 
1 




HG1000084N0 10000 gene_prediction 
1 


gi|6678794|ref|NP_032953.1| mitogen activated 
protein kinase kinase 1; MAP kinase kinase 1; 
protein kinase, mitogen activated, kinase 1, p45 
[Mus musculus] 


HG1000217N0 160000 gene_predictio 
nl 


gi|6681015|ref|NP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HG1000217N0_160000_gene_prediction2 


gi|6681015|ref|NP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HG1000329N0 160000 gene_predictio 
nl 


gi|26330870|dbj|BAC29165.1| unnamed protein 
product [Mus musculus] 


HG1000570N0_160000_gene_prediction1 


gi|6716522|gb|AAF26675.1|AF155821_1 
:PG16 [Mus musculus] 


HG1000617N0_40000_gene_prediction1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


< 

HG1000227N0_160000_gene_prediction1 


gi|21362402|sp|Q9CZB0|C560_MOUSE 
Succinate dehydrogenase cytochrome b560 
subunit, mitochondrial precursor (Integral 
membrane protein CII-3) (QPS1) (QPs-1) 


HG1000269N0_10000_gene_prediction1 


gi|7706341|ref|NP_057145.1| yippee protein 
[Homo sapiens] 


i 

HG1000615N0_160000_gene_prediction2 


gi|4506725|ref|NP_000998.1| ribosomal protein 
S4, X-linked X isoform; 40S ribosomal protein 
S4, X isoform; ribosomal protein S4X isoform; 
single-copy abundant mRNA; cell cycle gene 2 
[Homo sapiens] 


HG1000617N0_160000_gene_prediction1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000621N0_160000_gene_prediction2 


gi|4506725|ref|NP_000998.1| ribosomal protein 
S4, X-linked X isoform; 40S ribosomal protein 
S4, X isoform; ribosomal protein S4X isoform; 
single-copy abundant mRNA; cell cycle gene 2 
[Homo sapiens] 


HG1 000990N0_1 60000_gene_predictio 
nl 


gi|10946760|ref|NP_067381.1| triggering 
receptor expressed on myeloid cells 1 ; 
triggering receptor expressed in monocytes 1 
[Mus musculus] 


HG1000998N0_160000_gene_prediction1 


gi|6678483|ref|NP_033483.1| ubiquitin- 
activating enzyme E1, Chr X [Mus musculus] 


HG1001225N0_160000_gene_predictio 
nl 


gi|10181192|ref|NP_065589.1| sulfotransferase- 
related protein SULT-X1 [Mus musculus] 


HG1001269N0_5000_gene_prediction1 


gi|21311883|ref|NP_080887.1| RIKEN cDNA 
0610007O07 [Mus musculus] 


HG1001269N0_160000_gene_prediction1 


gi|21311883|ref|NP_080887.1| RIKEN cDNA 
0610007O07 [Mus musculus] 


HG 1 000 1 03N0_1 60000_gene_predictio 
nl 


gi|26327721|dbj|BAC27604.1| unnamed protein 
product [Mus musculus] 


HG1000143N0 1000 gene_predictionl 


gi|14141193|ref|NP_001004.2| ribosomal 
protein S9; 40S ribosomal protein S9 [Homo 
sapiens] 


HG1000396N0_160000_gene_predictio 
nl 


gi|25024769|ref|XP_207136.1| similar to ORF2 
[Mus musculus domesticus] 


HG1001502N0_160000_gene_prediction2 


gi|2144100|pir||I64837 Set beta isoform - rat 


HG1000066N0_160000_gene_prediction1 


gi|26337951|dbj|BAC32661.1| unnamed protein 
product [Mus musculus] 


HG1000078N0_1000_gene_prediction1 


gi|26346587|dbj|BAC36942.1| unnamed protein 
product [Mus musculus] 


HG1000117N0 160000 gene_predictio 
nl 


gi|20875580|ref|XP_131162.1| sorting nexin 7 
[Mus musculus] 


HG1000157NO 160000 gene_predictio 
nl 


gi|26344914|dbj|BAC36106.1| unnamed protein 
product [Mus musculus] 


HG1000194N0 160000 gene_predictio 
nl 


gi|21313022|ref|NP_083674.1| RIKEN cDNA 
5730496E24 [Mus musculus] 


HG1000501N0 160000 gene_predictio 
nl 


gi|27370478|ref|NP_766552.1| hypothetical 
protein E130310N06 [Mus musculus] 


HG1000656N0 10000 gene_prediction 
1 


gi|12855078|dbj|BAB30210.1| unnamed protein 
product [Mus musculus] 


HG1000656NO 10000 gene_prediction 
2 


gi|12855078|dbj|BAB30210.1| unnamed protein 
product [Mus musculus] 


HG1000750N0 160000 gene_predictio 
nl 


gi|26336392|dbj|BAC31881.1| unnamed protein 
product [Mus musculus] 


HG1001012NO 160000_gene_predictio 
nl 


gi|21312504|ref|NP_081554.1| RIKEN cDNA 
2810432D09 [Mus musculus] 


HG1001237N0 10000 gene_prediction 
1 


gi|20882986|ref|XP_126218.1| similar to 
Hermansky-Pudlak syndrome protein variant 
[Rattus norvegicus] [Mus musculus] 


HG1000228N0 40000 gene_prediction 
1 


gi|26342390|dbj|BAC34857.1| unnamed protein 
product [Mus musculus] 


HG1000228N0 20000 genejprediction 
1 


gi|13507676|ref|NP_109647.1| pumilio 1 
(Drosophila) [Mus musculus] 


HG1000228NO 160000 gene_predictio 
nl 


gi|13507676|ref|NP_109647.1| pumilio 1 
(Drosophila) [Mus musculus] 


HG1000390NO 160000 gene_predictio 
nl 


gi|20892585|ref|XP_147977.1| RIKEN cDNA 
2610001E17 [Mus musculus] 


HG1000409N0 10000 genejprediction 
1 


gi|26006245|dbj|BAC41465.1| mKIAA1047 
protein [Mus musculus] 


HG1000611N0 160000 gene_predictio 
nl 


gi|6650539|gb|AAF21895.1|AF103877_1 
epsilon-sarcoglycan [Mus musculus] 


HG1000847NO 10000 gene_prediction 
1 




HG 1 0000 1 5N0_0_gene_prediction 1 


gi|20467423|ref|NP_620570.1| chondroitin 
sulfate proteoglycan 4 [Mus musculus] 


HG 1 000088N0_5000_gene_predictionl 


gi|16741633|gb|AAH16619.1| pyruvate kinase 
3 [Mus musculus] 


HG1000143N0 10000 genejprediction 
1 


gi|20896345|ref|XP_128324.1| carbonyl 
reductase 3 [Mus musculus] 


HG 1 000 1 67N0_5000_gene_predictionl 


gi|12848663|dbj|BAB28043.1| unnamed protein 
product [Mus musculus] 


HG1000243N0_5000_gene_prediction1 


gi|8393534|ref|NP_058653.1| high mobility 
group protein 17 [Mus musculus] 


HG1000825N0_160000_gene_prediction1 


gi|21311983|ref|NP_080956.1| RIKEN cDNA 
1610012C01 [Mus musculus] 


HG1001019N0_1000_gene_prediction1 


gi|26343769|dbj|BAC35541.1| unnamed protein 
product [Mus musculus] 


HG1000044N0_160000_gene_prediction1 


gi|15079309|gb|AAH11494.1| Similar to 
Myosin of the dilute-myosin-V family [Mus 
musculus] 


HG1000100N0_10000_gene_prediction1 


gi|4506127|ref|NP_002755.1| phosphoribosyl 
pyrophosphate synthetase 1 [Homo sapiens] 


HG1000149NO 160000 gene_predictio 
nl 1 


gi|12834813|dbj|BAB23054.1| unnamed protein 
product [Mus musculus] 


HG1000183N0 1000_gene_predictionl 


gi|27370150|ref|NP_766364.1| hypothetical 
protein D630002G06 [Mus musculus] 


HG 1 000 1 83N0_1 60000_gene_predictio 
n2 


gi|27370150|ref|NP_766364.1| hypothetical 
protein D630002G06 [Mus musculus] 


HG1000213N0 5000 genejpredictionl 


gi|6753178|ref|NP_035923.1| breakpoint cluster 
region protein 1; barrier to autointegration 
factor [Mus musculus] 


HG1000294N0_5000_gene_prediction1 


gi|18390327|ref|NP_083908.1| protein 
phosphatase 1, regulatory (inhibitor) subunit 
11; t-complex testis-expressed 5 [Mus 
musculus] 


HG1000331N0_160000_gene_prediction1 


gi|20840824|ref|XP_141031.1| similar to slit 
homolog 1 (Drosophila); slit (Drosophila) 
homolog 1; slit1 [Homo sapiens] [Mus 
musculus] 


HG 1 00039 1 N0_1 60000_gene_predictio 
n2 


gi|20887543|ref|XP_134475.1| RIKEN cDNA 
2310022B05 [Mus musculus] 


HG 1 000430NO_1 60000_gene_predictio 
nl 


gi|26382861|dbj|BAC25510.1| unnamed protein 
product [Mus musculus] 


HG1000597NO_160000_gene _predictio 
nl 


gi|26325886|dbj|BAC26697.1| unnamed protein 
product [Mus musculus] 


HG1000078N0 5000 gene_predictionl 


gi|26346587|dbj|BAC36942.1| unnamed protein 
product [Mus musculus] 


HG1000139N0 5000 gene_predictionl 


gi|23597632|ref|XP_127052.2| similar to 
hypothetical protein FLJ13920 [Homo sapiens] 
[Mus musculus] 


HG1000143N0_160000_gene_prediction1 


gi|20896345|ref|XP_128324.1| carbonyl 
reductase 3 [Mus musculus] 


HG1000162N0_160000_gene_prediction1 


gi|20835770|ref|XP_132127.1| similar to 60S 
RIBOSOMAL PROTEIN L13 [Mus musculus] 


HG1000168N0_160000_gene_prediction1 


gi|12841593|dbj|BAB25272.1| unnamed protein 
product [Mus musculus] 


HG10001S7N0 160000 genejpredictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000247N0 160000 gene_predictio 
nl 


gi|7656920|ref|NP_056547.1| axin2 [Mus 
musculus] 


HG1000273NO 160000 gene_predictio 
n2 


gi|25030042|ref|XP_207307.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


HG1000415N0 10000_gene_prediction 
1 


gi|9367840|emb|CAB97523.1| hypothetical 
protein, weakly similar to (AF102871) neuronal 
apoptosis inhibitory protein 2 [Mus musculus] 
[Homo sapiens] 


HG1000539N0 160000 gene_predictio 
nl 


gi|7521942|pir||T29096 gag polyprotein - 
murine endogenous retrovirus ERV-L 


HG1000539N0 160000 gene_predictio 
n2 


gi|7521942|pir||T29096 gag polyprotein - 
murine endogenous retrovirus ERV-L 


HG1000560N0 160000 gene_predictio 
nl 


gi|12860683|dbj|BAB32021.1| unnamed protein 
product [Mus musculus] 


HG1000618N0 10000 gene_prediction 
1 


gi|26350749|dbj|BAC39011.1| unnamed protein 
product [Mus musculus] 


HG1000740N0 160000 gene_predictio 
nl 


gi|23601536|ref|XP_130965.2| Nice-4 protein 
homolog [Mus musculus] 


HG1001197N0 160000 gene_predictio 
nl 


gi|26327779|dbj|BAC27630.1| unnamed protein 
product [Mus musculus] 


HG 1 000599N0_5000_gene_prediction 1 


gi|12836542|dbj|BAB23701.1| unnamed protein 
product [Mus musculus] 


HG1000020N0_5000_gene_predictionl 


gi|20887101|ref|XP_129228.1| similar to 
phosphoglucomutase 5 [Homo sapiens] [Mus 
musculus] 


HG 1 000084N0_5000_gene_predictionl 


gi|6678794|ref|NP_032953.1| mitogen activated 
protein kinase kinase 1 ; MAP kinase kinase 1 ; 
protein kinase, mitogen activated, kinase 1, p45 
[Mus musculus] 


HG1 000 1 35N0_5000_gene_predictionl 


gi|21312189|ref|NP_081197.1| RIKEN cDNA 
1810010A06 [Mus musculus] 


HG1000169N0 20000 gene_prediction 
1 


gi|20886743|ref|XP_129211.1| phosphoserine 
aminotransferase [Mus musculus] 


HG1000169N0 160000 gene_predictio 
nl 


gi|20886743|ref|XP_129211.1| phosphoserine 
aminotransferase [Mus musculus] 


HG1000189N0 160000 gene_predictio 
nl 


gi|20879992|ref|XP_140210.1| similar to 
BG:DS01759.1 gene product [Drosophila 
melanogaster] [Mus musculus] 


HG1000189N0 160000 gene_predictio 
n2 


gi|20879992|ref|XP_140210.1| similar to 
BG:DS01759.1 gene product [Drosophila 
melanogaster] [Mus musculus] 


HG1000246N0_5000_gene_prediction1 


gi|21450297|ref|NP_659157.1| UDP- 
GalNAc:polypeptide N- 
acetylgalactosaminyltransferase [Mus 
musculus] 


HG1000248N0_0_gene_prediction1 


gi|9790219|ref|NP_062745.1| destrin; Sid23p 
[Mus musculus] 


HG1000288N0_10000_gene_prediction1 


gi|20909512|ref|XP_153447.1| hypothetical 
protein XP_153447 [Mus musculus] 


HG1000424N0_5000_gene_prediction1 


gi|25031822|ref|XP_207741.1| hypothetical 
protein XP_207741 [Mus musculus] 


HG1000443N0_40000_gene_prediction1 


gi|26354072|dbj|BAC40666.1| unnamed protein 
product [Mus musculus] 

HG1000590N0_1000_gene_prediction1 


gi|26378096|dbj|BAB28595.2| unnamed protein 
product [Mus musculus] 


HG1000626N0_160000_gene_prediction1 


gi|9938030|ref|NP_064667.1| hypothetical 
protein, MNCb-4193; hypothetical protein 
MNCb-4193 [Mus musculus] 


HG 1 00087 1 N0_1 60000_gene_predictio 
nl 


gi|6752958|ref|NP_033742.1| activin A 
receptor, type II-like 1; activin receptor-like 
kinase-1 [Mus musculus] 


HG 1 000959N0_1 0000_gene_prediction 
1 


gi|22507385|ref|NP_081019.1| RIKEN cDNA 
1110014F12 [Mus musculus] 


HG 1 000961NO_1 60000_gene_predictio 
n3 


gi|20822904|ref|XP_131914.1| RIKEN cDNA 
3110004O18 [Mus musculus] 


HG1000974NO 5000 gene predictionl 


gi|26378096|dbj|BAB28595.2| unnamed protein 
product [Mus musculus] 


HG 1 00 1 045N0_1 60000_gene_predictio 
nl 


gi|25020138|ref|XP_207789.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


HG1001110NO 0 gene_predictionl 


gi|23956080|ref|NP_058675.1| putative 
serine/threonine kinase [Mus musculus] 


HG1001223N0 1000 _genejpredictionl 


gi|26339658|dbj|BAC33500.1| unnamed protein 
product [Mus musculus] 


HG 1 00 1 28 1N0_1 60000 _gene_predictio 
nl 


gi|15431279|ref|NP_203538.1| dedicator of 
cyto-kinesis 2 [Mus musculus] 


HG1001317N0 5000 gene_predictionl 


gi|26327365|dbj|BAC27426.1| unnamed protein 
product [Mus musculus] 


HG1001485N0 5000 gene_predictionl 


gi|26327365|dbj|BAC27426.1| unnamed protein 
product [Mus musculus] 


HG1000674N0_160000_gene_prediction1 


gi|24211881|sp|Q8VCR8|KML2_MOUSE 
Myosin light chain kinase 2, skeletal/cardiac 
muscle (MLCK2) 


HG1001017N0_10000_gene_prediction1 


gi|25019831|ref|XP_207463.1| similar to 
CD59B [Mus musculus] 


HG1001017N0_1000_gene_prediction1 


gi|25019831|ref|XP_207463.1| similar to 
CD59B [Mus musculus] 


HG1000014N0 160000_gene_predictio 
n2 


gi|6680744|ref|NP_031528.1| ATPase, Na+/K+ 
transporting, beta 3 polypeptide; ATPase, 
Na+/K+ beta 3 polypeptide [Mus musculus] 


HG1000043N0 160000 gene_predictio 
n3 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1000052NO 160000 gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG 1 000084N0_5000_gene_prediction2 


gi|6678794|ref]NP_032953.1| mitogen activated 
protein kinase kinase 1; MAP kinase kinase 1 ; 
protein kinase, mitogen activated, kinase 1 , p45 
[Mus musculus] 


HG 1 000093N0_1 000_gene_predictionl 


gi|26350865|dbj|BAC39069.1| unnamed protein 
product [Mus musculus] 


HG1000105N0 160000 genejpredictio 
nl 


gi|14198371|gb|AAH08247.1| Similar to cyclin 
B2 [Mus musculus] 


HG 1 0001 57N0_1 000_gene_predictionl 


gi|5803225|ref|NP_006752.1| tyrosine 
3-monooxygenase/tryptophan 5-monooxygenase activation 
protein, epsilon polypeptide; 14-3-3 epsilon; 
mitochondrial import stimulation factor L 
subunit; protein kinase C inhibitor protein-1 
[Homo sapiens] 


HG1000210N0 40000_gene_prediction 
1 


gi|17160840|gb|AAH17597.1| RIKEN cDNA 
5830401B18 gene [Mus musculus] 


HG1000242N0_5000_gene_prediction1 


gi|9789937|ref|NP_062768.1| DnaJ (Hsp40) 
homolog, subfamily A, member 2; DNA J 
protein [Mus musculus] 


HG1000243N0_5000_gene_prediction2 


gi|8393534|ref|NP_058653.1| high mobility 
group protein 17 [Mus musculus] 


HG1000256NO 160000_gene_predictio 
nl 


gi|13959400|sp|Q9R0Y5|KAD1_MOUSE 
Adenylate kinase isoenzyme 1 (ATP-AMP 
transphosphorylase) (AK1) (Myokinase) 


HG 1 000279N0_0_gene_predictionl 


gi|15617203|ref]NP_254279.1| chloride 
intracellular channel 1 [Mus musculus] 


HG1000280N0_5000_gene_prediction1 


gi|7106337|ref|NP_034796.1| keratin complex- 
1, gene C29 [Mus musculus] 


HG1000280N0_5000_gene_prediction2 


gi|7106337|ref|NP_034796.1| keratin complex- 
1, gene C29 [Mus musculus] 


HG1000282N0 160000 gene_predictio 
nl 


gi|20902823|ref|XP_128021.1| similar to 
Mitochondrial import receptor subunit TOM22 
homolog (Translocase of outer membrane 22 
kDa subunit homolog) (hTom22) (1C9-2) [Mus 
musculus] 



nl 



HG1000330N0_20000_gene_prediction 



HG1000339N0_160000_gene_predictio 


gi|26350551|dbj|BAC38915.1| unnamed protein 
product [Mus musculus] 



i^55340N6' 160000 gene_predictio |gi|209 12S42MXPJ 26689 1| KIKE* cDNA 
-■ ~ 3300001P08 [Mus musculus] 



Fantom Top Hit Annotation 



HG1000292N0_160000_gene_predictio 



gi|6981488|ref|NP_037356.1| ribosomal protein 
S26 [Rattus norvegicus] 



HG1000313N0_160000_gene_predictio 



gi|4506283|ref|NP_003454.1| protein tyrosine 
phosphatase type IVA, member 1; Protein 
tyrosine phosphatase IVA1 [Homo sapiens] 



gi|22122511|ref|NP_666146.1| hypothetical 
protein MGC30562 [Mus musculus] 



^° iUU " - I protein MGC27983 [Mus musculus] 



HG1000365N0_20000_gene_prediction 


gi|25046794|ref|XP_207489.1| similar to RNP 
particle component [Mus musculus] 



iGTb^i4W"l60000 gene_predictio Jgi|20909520|ret|XP_126941.1| RIKEN cDNA 

, — .. 2600011C06 [Mus musculus] 
nl 1 — ■ 



HG1000448N0_160000_gene_prediction1 


gi|6678247|ref|NP_033358.1| transcription 
factor 7-like 1 [Mus musculus] 



551555^ 160000 genejpredictio gi^ 

~ [product [Mus musculus] 



IegTo^So^^ 

- jproduct [Mus musculus] 



klMSiSiiSli^BBne^ctio |gi|20909520|reflXP_126941 1| RIKiiJN cDNA 

, 1 _ _ & 260001 1C06 [Mus musculus] 
nl 1 — 



^oTTi^l^O gene jpredictio gi|26351279|dbj|BAC39276.1| unnamed protein 
" ~ ~ product [Mus musculus] 

HG1000550N0_160000_gene_prediction1 


gi|20909520|ref|XP_126941.1| RIKEN cDNA 
2600011C06 [Mus musculus] 



HG1000556N0_160000_gene_predictio 


gi|25031497|ref|XP_207552.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 



HG1000588N0_160000_gene_predictio 


gi|13277747|gb|AAH03768.1| interferon- 
induced protein with tetratricopeptide repeats 1 
[Mus musculus] 



HG1000600N0_160000_gene_predictio 


gi|20863376|ref|XP_134148.1| similar to 
hypothetical protein [Macaca fascicularis] [Mus 
musculus] 

HG1000647N0_160000_gene_prediction1 


gi|9506517|ref|NP_062338.1| cytotoxic and 
regulatory T cell molecule; class I-restricted T 
cell-associated molecule [Mus musculus] 



^^^1^6 gene_predictio teb _ 900199MXP-128639 1| RIKEN cDNA 

, _ _ & I2810055C19 [Mus musculus] 
nl 1 — 



^T^nj^o T^OO gene predictio t^l2632^ 7071dbi|BAC^597.1| unnamed protein 
nl 


product [Mus musculus] 


HG1000696N0 160000 gene predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000788N0 160000 gene_predictio 
nl 


gi|20847912|ref|XP_144610.1| similar to 
KIAA1904 protein [Homo sapiens] [Mus 
musculus] 


HG1000874NO 160000 gene_predictio 
nl 


gi|20342176|ref|XP_110490.1| similar to 
hypothetical protein MGC955 [Homo sapiens] 
[Mus musculus] 


HG1000902N0 20000 gene_prediction 
1 


gi|6753324|ref|NP_033968.1| chaperonin 
subunit 6a (zeta); chaperonin containing TCP-1 
[Mus musculus] 


HG1000902N0 160000 gene_predictio 
n2 


gi|6753324|ref|NP_033968.1| chaperonin 
subunit 6a (zeta); chaperonin containing TCP-1 
[Mus musculus] 


HG1000902NO_1 000_gene_predictionl 


gi|6753324|ref|NP_033968.1| chaperonin 
subunit 6a (zeta); chaperonin containing TCP-1 
[Mus musculus] 


HG1000904N0 160000 gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000966N0_1000_gene_predictionl 


gi|22122617|ref|NP_666215.1| hypothetical 
protein MGC25511 [Mus musculus] 


HG1 000966N0_5000_gene_predictionl 


gi|22122617|ref|NP_666215.1| hypothetical 
protein MGC25511 [Mus musculus] 


HG1000994N0 160000 gene_predictio 
nl 


gi|12855175|dbj|BAB30238.1| unnamed protein 
product [Mus musculus] 


HG1001014N0 160000 gene_predictio 
n3 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1001041N0_5000_gene_predictionl 


gi|25071304|ref|XP_146497.3| similar to 
protein serine kinase Pskh1 [Mus musculus] 


HG1001337NO 160000 gene_predictio 
nl 


gi|27369704|ref|NP_766096.1| hypothetical 
protein 6030499008 [Mus musculus] 


HG1001417N0_5000_gene_predictionl 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001485N0 160000 gene_j>redictio 
nl 


gi|7513636|pir||T30805 dutt1 protein - mouse 


HG1000151NO 160000 gene_predictio 
nl 


gi|18044328|gb|AAH19573.1| Unknown 
(protein for IMAGE:3990036) [Mus musculus] 


HG1000330NO 160000 gene_predictio 
n3 


gi|25029811|ref|XP_207217.1| similar to ORF2 
[Mus musculus domesticus] 


HG1000957N0 20000 gene_prediction 
1 


gi|25024769|ref|XP_207136.1| similar to ORF2 
[Mus musculus domesticus] 


HG 1 000960N0_0_gene_predictionl 


gi|20908689|ref|XP_127449.1| RIKEN cDNA 
4632401C08 [Mus musculus] 


HG1000960N0_0_gene_prediction2 


gi|20908689|ref|XP_127449.1| RIKEN cDNA 
4632401C08 [Mus musculus] 


HG1001280N0 20000 gene_prediction g 
1 ~ F 


'iinii4:n&i\AU\\Ti A f*'* 9 064 1 1 unnamed nrotein 
product [Mus musculus] 


HG1001502N0_160000_genej>redictio g 
nl E 


MioT^7n9dnirpflMP 766415 11 hypothetical 
Protein 4732490P 1 8 [Mus musculus] 


HG1000003N0 10000 gene_prediction £ 
1 


n|136243Uj|rei|lNr_i iz £ + £ *\j.i\ procollagen, 
ype II, alpha 1 [Mus musculus] 


HG1000041N0 160000 _gene_predictio 
nl '"' I 


yi|263901oy|aDj|r>ACZDo3H-.i| uiuidiiicu piuicui 
Droduct [Mus musculus] 


HG1000043N0_160000_gene_prediction2 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1000044N0_5000_gene_prediction1 


gi|15079309|gb|AAH11494.1| Similar to 
Myosin of the dilute-myosin-V family [Mus 
musculus] 


HG1 00005 1N0_1 60000_genejpredictio 
nl 


gi|14250190|gb|AA±lUojl^.i| mteneron 
regulatory factor 6 [Mus musculus] 


HG 1 000057N0_1 60000_gene jpredictio 
nl 


gi|6755040|rei|iNP_Ui!>2uz.l| promin 1, aiAin 
binding protein [Mus musculus] 


HG 1 000060NO_1 60000_gene_predictio 
nl 


jgi|6755901|rei|Nr — xjdj /oo.i| iuduiui, aipna i, 
tubulin alpha 1 [Mus musculus] 


HG 1 00006 1N0_1 0000_gene jprediction 
1 


gi|20827552|reHAr_l mzi**. i \ expreb&eu 
sequence AW6 10751 [Mus musculus] 


HG 1 000079N0_1 60000_gene_predictio 
nl 


gi|20887309|rei|XJr_lzyzUU.i| aaenyicuc Kinase 
3 alpha like [Mus musculus] 


HG 1 000098N0_1 60000_gene_predictio 
nl 


•irt/-/, yin/r/c/cii-iu^iii a r^^^oo^ 1 1 unnamed nrotein 
gij26340ooo|duj|t>A^3 i| uniidiiicu puiwn 

product [Mus musculus] 


HG1000105N0 5000 gene_predictionl 


•11 '-»o^n/CAAM'UilTi ATl / >R'7RS 1 1 unnamed nrotein 
gl|12o5UoUU|QDJ|l5/vDZO /o->.l| uiuicuiiou. piuivui 

product [Mus musculus] 


HG1 0001 2 1N0_1 60000_gene jpredictio 
nl 


gl|263464U2|uDJ|r>A^jOojZ.i| uiuiaincu. piuiv/m 
product [Mus musculus] 


HG10001 3 1N0_1 60000_gene_predictio 
nl 


•lozriom oiMUiiT-i a f^OR^^O 1 1 unnamed nrotein 
product [Mus musculus] 


HG 1 000 1 34N0_1 60000_gene_predictio 
nl 


•ii ^o/zninn\AVx\\vi A"R^ 1 Q^4 11 unnamed Drotein 
gi|12ooU3 / /|aDj|r>/yjj3 lyj**. i \ uiuwuk/u f aw 

product [Mus musculus] 


HG 1 000 1 34N0_1 60000_gene_predictio 
n2 


•m oo/cnmMKiiu a 1 0^4 11 unnamed Drotein 
product [Mus musculus] 


HG1 0001 36N0_1 60000_gene_predictio 
nl 


• i^/ci on< i ouui in a ^74 S 1 1 unnamed Drotein 
gi 263 o;0 iy|uDj|r>Av-'Zj uiiiiouk** p vtvlA1 

product [Mus musculus] 


HG1000147N0_160000_gene_prediction1 


gi|3717978|emb|CAA73041.1| 5S ribosomal 
protein [Mus musculus] 


HG1000166N0_160000_gene_prediction1 


gi|20908717|ref|XP_127445.1| similar to 
flavoprotein subunit of succinate-ubiquinone 
reductase [Rattus norvegicus] [Mus musculus] 


HG1000172N0_1000_gene_prediction1 


gi|6681095|ref|NP_031834.1| cytochrome c, 
somatic [Mus musculus] 


HG1000172N0_1000_gene_prediction2 


gi|6681095|ref|NP_031834.1| cytochrome c, 
somatic [Mus musculus] 


HG 10001 75N0_5000_gene_predictionl 


gi|26354216|dbj|BAC40736.1| unnamed protein 
product [Mus musculus] 


HG1000175N0 10000 gene_prediction 
1 


gi|26354216|dbj|BAC40736.1| unnamed protein 
product [Mus musculus] 


HG1000175N0 160000 gene_predictio 
nl 


gi|26354216|dbj|BAC40736.1| unnamed protein 
product [Mus musculus] 


HG1 0001 75N0_1000_gene_predictionl 


gi|26354216|dbj|BAC40736.1| unnamed protein 
product [Mus musculus] 


HG1000192NO 160000 gene_predictio 
nl 


gi|10946614|ref|NP_067287.1| WD repeat 
domain 12; nuclear protein Ytm1 [Mus 
musculus] 


HG1000193NO 160000 gene_predictio 
n2 


gi|21728370|ref]NP_080178.1| RIKEN cDNA 
1500009M05 [Mus musculus] 


HG1000195N0 160000 gene_predictio 
nl 


gi|17390530|gb|AAH18231.1| Unknown 
(protein for MGC:19236) [Mus musculus] 


HG1000197N0 160000 genejpredictio 
nl 


gi|21450185|ref|NP_659063.1| hypothetical 
protein MGC28186 [Mus musculus] 


HG1000202N0 20000 gene_prediction 
1 


gi|26331946|dbj|BAC29703.1| unnamed protein 
product [Mus musculus] 


HG1000210N0 20000 gene_prediction 
1 


gi|17160840|gb|AAH17597.1| RIKEN cDNA 
5830401B18 gene [Mus musculus] 


HG 1 0002 1 8N0_1 000_gene_prediction 1 


gi|6681015|ref]NP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HG1000218N0 160000 gene_predictio 
nl 


gi|6681015|ref|NP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HG1000218N0 10000 genejjrediction 
1 


gi|6681015|ref|NP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HG1000222N0_1000_gene_predictionl 


gi|13385054|ref|NP_079873.1| RIKEN cDNA 
2700033I16 [Mus musculus] 


HG 1 000233N0_1 000_gene_predictionl 


gi|12847362|dbj|BAB27541.1| unnamed protein 
product [Mus musculus] 


HG 1 000234N0_l O00_gene_predictionl 


gi|12847362|dbj|BAB27541.1| unnamed protein 
product [Mus musculus] 


HG1000234N0 160000 genejpredictio 
nl 


gi|12847362|dbj|BAB27541.1| unnamed protein 
product [Mus musculus] 


HG1 00023 8N0 160000 gene_predictio 
n2 


gi|6671549|ref|NP_031479.1| anti-oxidant 
protein 2; acidic calcium-independent 
phospholipase A2; peroxiredoxin 5; 1-Cys Prx 
[Mus musculus] 


HG1000240N0_160000_gene_predictio 


gi|26328673|dbj|BAC28075.1| unnamed protein 
product [Mus musculus] 

^iioc^ni^9lHViilRAR9R604 11 unnamed protein 
product [Mus musculus] 


HG1000245N0 5000 gene_predictionl 


rr;noc^ni^9IHhilRAR28604 11 unnamed protem 

2111 /OJul j£l\AUJ\XJl^MJ^<J\7\J-T» a | ********* XT 

product [Mus musculus] 


HG1 000249N0_1 0000_gene_prediction 
1 


^i^^A^^Ait-^flMP fR4905 1 1 mannose binding 

^llO / JHOJ^|rei|*NJr Uj*t7v;j.i] *i*«****v«^ o 

lectin, liver (A) [Mus musculus] 


HG1000251N0_160000_gene_predictio 


^lorkQQia.^iivaflYP 196911 11 Dullard 
homolog [Mus musculus] 


HG1000252N0 5000 gene_predictionl 


^lonoo^^lrMTYP 199507 Urine finger 
protein 2 [Mus musculus] 


HG1000254N0_l60000_gene_predictio 
nl 


^naiQ^n^sir^fllsJP 079878 1 1 hvDOthetical 
protein D10Ertd718e [Mus musculus] 


HG 1 000262N0_1 60000_gene_predictio 

nl 


gi|21312163|ref|NP_082683.1| RIKEN cDNA 
2900054P12 [Mus musculus] 


HG1000264N0_1000_gene_prediction1 


gi|21624617|ref|NP_081018.1| RIKEN cDNA 
1110007M04 [Mus musculus] 


HG1000264N0_1000_gene_prediction2 


gi|21624617|ref|NP_081018.1| RIKEN cDNA 
1110007M04 [Mus musculus] 


HG1000270N0_20000_gene_prediction1 


gi|12844196|dbj|BAB26273.1| unnamed protein 
product [Mus musculus] 


HG1000270N0_1000_gene_prediction1 


gi|12852884|dbj|BAB29566.1| unnamed protein 
product [Mus musculus] 


HG1000274N0_160000_gene_predictio 


gi|26347831|dbj|BAC37564.1| unnamed protein 
product [Mus musculus] 


HG1000276N0_160000_gene_predictio 


gi|19527228|ref|NP_598768.1| DNA segment, 
Chr 10, ERATO Doi 214, expressed [Mus 
musculus] 


HG1000276N0_5000_gene_prediction1


gi|19527225|ref|NP_598768.1| DNA segment,
Chr 10, ERATO Doi 214, expressed [Mus
musculus]


" gi|19527026|reflNP 598568.1| expressed 

HG1000278NO 5000 gene_prediction1 secuence AA959742 [Mus musculus] 


HG1000280N0_160000_gene_predictio


gi|7106337|ref|NP_034796.1| keratin complex-1,
gene C29 [Mus musculus]


- ; gil7106337|reflNP_034796.1| keratin complex- 

u^nnrmnNO 1000 eene predictionl 1 , gene C29 [Mus musculus]_ 


HG1000280N0_160000_gene_predictio


gi|7106337|ref|NP_034796.1| keratin complex-1,
gene C29 [Mus musculus]


" " Kn06337|ref|NP_034796.1| keratin complex- 

urnnnrmONO 1000 eene yr^ninn?. 1 . gene C29 [Mus musculus] 


HG1000305N0_5000_gene_prediction1


gi|27369902|ref|NP_766218.1| hypothetical
protein A530095G11 [Mus musculus]






WO 2005/005597 



PCT/US2003/027106



FP ID


Fantom Top Hit Annotation 


HG1000305N0_5000_gene_prediction2 


gi|27369902|ref|NP_766218.1| hypothetical
protein A530095G11 [Mus musculus]


HG1000307N0_160000_gene_prediction1


gi|8393853|ref|NP_058614.1| nudix (nucleoside
diphosphate linked moiety X)-type motif 5 
[Mus musculus] 


HG1000334N0_160000_gene_prediction1


gi|20888553|ref|XP_134832.1| similar to 
Probable serine/threonine protein kinase 
SNF1LK [Mus musculus] 


HG1000335N0_160000_gene_prediction1


gi|20888553|ref|XP_134832.1| similar to
Probable serine/threonine protein kinase 
SNF1LK [Mus musculus] 


HG1000337N0_5000_gene_prediction1


gi|12851918|dbj|BAB29207.1| unnamed protein 
product [Mus musculus] 


HG1000343N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000343N0_160000_gene_prediction2


gi|13386340|ref|NP_083008.1| RIKEN cDNA
4632428N05 [Mus musculus] 


HG1000369N0_160000_gene_prediction1


gi|12837873|dbj|BAB23982.1| unnamed protein 
product [Mus musculus] 


HG1000372N0_160000_gene_prediction1


gi|20913947|ref|XP_126555.1| RIKEN cDNA 
1190006K01 [Mus musculus] 


HG1000378N0_160000_gene_prediction1


gi|26348995|dbj|BAC38137.1| unnamed protein
product [Mus musculus] 


HG1000387N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000387N0_160000_gene_prediction2


gi|26382861|dbj|BAC25510.1| unnamed protein
product [Mus musculus] 


HG1000397N0_5000_gene_prediction1


gi|20836469|ref|XP_129717.1| hypothetical
protein XP_129717 [Mus musculus]


HG1000408N0_160000_gene_prediction1




HG1000414N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000431N0_160000_gene_prediction1


gi|8394057|ref|NP_058565.1| low density
lipoprotein receptor-related protein 4; low 
density lipoprotein-related protein 4; Low 
Density Lipoprotein Receptor Related Protein 
4; corin [Mus musculus] 


HG1000439N0_160000_gene_prediction1


gi|12851918|dbj|BAB29207.1| unnamed protein 
product [Mus musculus] 


HG1000449N0_20000_gene_prediction1


gi|25025117|ref|XP_207206.1| similar to
transcription factor-like nuclear regulator;
putative transcription regulation nuclear
protein; putative transcription factor-like
HG1000457N0_160000_gene_predictio gi|20824761|ref|XP_133346.1| liver-specific
HG1000457MU_iouuuu_g _p &^ ^ transcriptio n factor [Mus musculus] 

nl ' — - - : " ~ 



HG1000458N0_160000_gene_predictio


gi|12841242|dbj|BAB25129.1| unnamed protein
product [Mus musculus]



ii^554WT60000 gene_predictio gi|25032310|rellXF • 205729.1| hypothetical 
•- • - lprotein XP_205729 fMus musculus] 



HG1000463N0_160000_gene_predictio


gi|12861068|dbj|BAB32114.1| unnamed protein
product [Mus musculus]



HG1000463N0_160000_gene_predictio


gi|13249351|ref|NP_076402.1| inositol-
requiring 1 alpha (yeast) [Mus musculus]



HG1000476N0_160000_gene_predictio


gi|26332657|dbj|BAC30046.1| unnamed protein
product [Mus musculus]



HG1000481N0_160000_gene_prediction1


gi|21311873|ref|NP_077181.1| RIKEN cDNA
0610007A03 [Mus musculus]



HG1000530N0_160000_gene_predictio


gi|20860491|ref|XP_153755.1| hypothetical
protein XP_153755 [Mus musculus]



HG1000556N0_160000_gene_prediction2


gi|25031497|ref|XP_207552.1| similar to
Retrovirus-related POL polyprotein [Mus
musculus]



Io7b005S4N0 160000 gene _predictio gi|273 70500|reflNP_76658 1.1 1 hypothetical 
HUluuio&^u. _s _f |protein D230008H22 [Mus musculus] 



iiGTb00587N0 160000 gene_predictio gi|236i2i49|reflXP_l 58842.2| hypothetical 
HU1UUUD6/inu_i _f lu tein XP 158842 fMus musculus] 



^b^i^O^O gene_predictio U|263iii99|dbj|BAC3»439.1| unnamed protem 
j product [Mus musculus] 

iGl^i¥ 160000 gene _predictio gi|2209i5l5|reflNP_084065 1| RIKEN cDNA 

, _ ~ 0610013117 [Mus musculus] 
nl 1 ■ 



i^^n^b"0O gene_predictio gi|22095015MNP_084<>65 1| RIKEN cDNA 
" - " l0610013I17 [Mus musculus] 

Vil ' — — " 



HG1000608N0_160000_gene_predictio


gi|20345223|ref|XP_109778.1| similar to
Neurabin-II (Neural tissue-specific F-actin
binding protein II) (Protein phosphatase 1
regulatory subunit 9B) (Spinophilin) (p130)
(PP1bp134) [Mus musculus]



HG1000620N0_160000_gene_predictio


gi|25052462|ref|XP_138105.3| similar to TAR
DNA-binding protein-43 (TDP-43) [Mus
musculus]



HG1000621N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]

HG1000621N0_160000_gene_prediction3


gi|26382861|dbj|BAC25510.1| unnamed protein 
product [Mus musculus] 


HG1000631N0_40000_gene_prediction1


gi|6681283|ref|NP_031938.1| epidermal growth
factor receptor; avian erythroblastic leukemia 
viral (v-erb-b) oncogene homolog [Mus 
musculus] 


HG1000652N0_160000_gene_prediction1


gi|25030122|ref|XP_207332.1| similar to
endonuclease/reverse transcriptase [Mus 
musculus] 


HG1000663N0_160000_gene_prediction1


gi|20915416|ref|XP_162987.1| hypothetical
protein XP_162987 [Mus musculus]


HG1000686N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000700N0_160000_gene_prediction1


gi|16508047|gb|AAL17972.1| pORF2 [Mus 
musculus domesticus] 


HG1000701N0_160000_gene_prediction1


gi|26327167|dbj|BAC27327.1| unnamed protein
product [Mus musculus] 


HG1000709N0_160000_gene_prediction1


gi|220579|dbj|BAA00448.1| open reading 
frame (196 AA) [Mus musculus] 


HG1000712N0_160000_gene_prediction1


gi|12841826|dbj|BAB25366.1| unnamed protein 
product [Mus musculus] 


HG1000720N0_160000_gene_prediction1


gi|7657415|ref|NP_035986.2| odd Oz/ten-m
homolog 2 (Drosophila); odd Oz/ten-m 
homolog 3 (Drosophila) [Mus musculus] 


HG1000727N0_160000_gene_prediction1


gi|26335645|dbj|BAC31523.1| unnamed protein
product [Mus musculus] 


HG1000743N0_160000_gene_prediction2


gi|26338834|dbj|BAC33088.1| unnamed protein
product [Mus musculus] 


HG1000767N0_5000_gene_prediction1


gi|12851918|dbj|BAB29207.1| unnamed protein 
product [Mus musculus] 


HG1000786N0_160000_gene_prediction2


gi|6678303|ref|NP_033386.1| transcription
factor A, mitochondrial [Mus musculus] 


HG1000822N0_160000_gene_prediction1


gi|6680195|ref|NP_032255.1| histone 
deacetylase 2; DNA segment, Chr 10, Wayne 
State University 179, expressed [Mus 
musculus] 


HG1000829N0_160000_gene_prediction1


gi|21450159|ref|NP_659049.1| cDNA sequence
BC024131; hypothetical protein MGC37S96 
[Mus musculus]


HG1000848N0_160000_gene_prediction1


gi|26350995|dbj|BAC39134.1| unnamed protein
product [Mus musculus]


HG1000860N0_160000_gene_prediction1


gi|26325678|dbj|BAC26593.1| unnamed protein 
product [Mus musculus] 


HG1000898N0_10000_gene_prediction1


gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]



HG1000898N0_160000_gene_prediction1



gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus] 



HG1000898N0_20000_gene_prediction1



HG1000902N0_160000_gene_prediction1



HG1000904N0_160000_gene_prediction3



Fantom Top Hit Annotation



gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]



gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]



gi|6753324|ref|NP_033968.1| chaperonin
subunit 6a (zeta); chaperonin containing TCP-1
[Mus musculus]



HG1000921N0_5000_gene_prediction1



gi|26346114|dbj|BAC36708.1| unnamed protein
product [Mus musculus]



HG1000961N0_160000_gene_prediction1





gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus]



HG1000961N0_160000_gene_predictio 
n2 



gi|25051287|ref|XP_146665.3| similar to
<QAA0S77 protein [Homo sapiens] [Mus 
musculus] 



HG1001000N0_160000_gene_prediction2



gi|20859143|ref|XP_127126.1| similar to 
eukaryotic initiation factor 5 [Rattus 
norvegicus] [Mus musculus] 



HG1001003N0_160000_gene_prediction1



HG1001007N0_160000_gene_prediction1



HG1001009N0_0_gene_prediction1



gi|19527072|ref|NP_598613.1| expressed
sequence AW555139 [Mus musculus] 



gi|13277825|gb|AAH03796.1| Similar to 
lymphocyte specific 1 [Mus musculus] 



gi|26334641|dbj|BAC31021.1| unnamed protein
product [Mus musculus] 



HG1001014N0_160000_gene_prediction2



gi|26329567|dbj|BAC28522.1| unnamed protein
product [Mus musculus] 



HG1001017N0_40000_gene_prediction1



HG1001017N0_20000_gene_prediction1



HG1001144N0_160000_gene_prediction1



gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 



gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus] 



gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus] 



HG1001172N0_160000_gene_prediction2



gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 
HG1001214N0_20000_gene_prediction1


gi|26340706|dbj|BAC34015.1| unnamed protein 
product [Mus musculus] 


HG1001229N0_160000_gene_prediction1




HG1001253N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1001253N0_160000_gene_prediction2


gi|26326251|dbj|BAC26869.1| unnamed protein 
product [Mus musculus] 


HG1001267N0_160000_gene_prediction1


gi|26326251|dbj|BAC26869.1| unnamed protein 
product [Mus musculus] 


HG1001289N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1001343N0_10000_gene_prediction1


gi|26333317|dbj|BAC30376.1| unnamed protein 
product [Mus musculus] 


HG1001343N0_160000_gene_prediction1


gi|6755060|ref|NP_035214.1|
phosphatidylinositol 3-kinase, C2 domain 
containing, gamma polypeptide [Mus 
musculus] 


HG1001390N0_160000_gene_prediction1


gi|6755060|ref|NP_035214.1|
phosphatidylinositol 3-kinase, C2 domain 
containing, gamma polypeptide [Mus 
musculus] 


HG1001468N0_160000_gene_prediction1


gi|6680083|ref|NP_032189.1| growth factor
receptor bound protein 2 [Mus musculus] 


HG1001508N0_160000_gene_prediction2


gi|25030495|ref|XP_205178.1| similar to
bA130N24.1 (novel protein similar to REV3L
(REV3 (yeast homolog)-like, catalytic subunit
of DNA polymerase zeta) (POLZ)) [Homo
sapiens] [Mus musculus] 


HG1000084N0_160000_gene_prediction1


gi|26382861|dbj|BAC25510.1| unnamed protein 
product [Mus musculus] 


HG1000084N0_160000_gene_prediction2


gi|25031822|ref|XP_207741.1| hypothetical
protein XP_207741 [Mus musculus]


HG1000209N0_160000_gene_prediction1


gi|25031822|ref|XP_207741.1| hypothetical
protein XP_207741 [Mus musculus]


HG1000382N0_160000_gene_prediction1


gi|20858167|ref|XP_125585.1| similar to
PTD013 protein; CGI-24 protein [Mus 
musculus] 


HG1000591N0_160000_gene_prediction1


gi|6678716|ref|NP_032539.1| low density
lipoprotein receptor-related protein 5; low
density lipoprotein-related protein 5 [Mus 
musculus] 


HG1000904N0_160000_gene_prediction4


gi|26330005|dbj|BAC28741.1| unnamed protein
product [Mus musculus] 
HG1000014N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]



HG1000015N0_160000_gene_prediction1


gi|6680744|ref|NP_031528.1| ATPase, Na+/K+
transporting, beta 3 polypeptide; ATPase,
Na+/K+ beta 3 polypeptide [Mus musculus]



HG1000015N0_20000_gene_prediction


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]



Jgi|20467423|reflNP_620570. 1 1 chondroitin 
HG1000015NO <nnn gene prediction! sulfate proteoglycan 4 fMus musculus] 



HG1000015N0_160000_gene_prediction2


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]



HG1000020N0_160000_gene_prediction1


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]



HG1000020N0_5000_gene_prediction2


gi|26330706|dbj|BAC29083.1| unnamed protein
product [Mus musculus]



HG1000024N0_10000_gene_prediction1


gi|20887101|ref|XP_129228.1| similar to
phosphoglucomutase 5 [Homo sapiens] [Mus
musculus]

I5I™oT6NO ,60000_gene_ P red 1 C i o Ll» S^ldbilBAB^.,, unnamed proiein 

n - product [Mus musculus] 

HG1000030N0_160000_gene_predictio


gi|9506367|ref|NP_062425.1| ATP-binding
cassette, sub-family B, member 10; ATP-
binding cassette, sub-family B (MDR/TAP),
member 12; Abc-mitochondrial erythroid [Mus
musculus]

HO1000O39N0 160000 _gene_predic.i° 8i |26U06203|dbj|BAC41444.1| mKIAA0696 
" — [protein [Mus musculus] 



HG1000041N0_5000_gene_prediction


gi|7106453|ref|NP_035897.1| zinc finger RNA
binding protein [Mus musculus]

H^b^Tl6_^^ 

~ | product [Mus musculus] I 



HG1000043N0_5000_gene_prediction1



gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 



HG1000044N0_20000_gene_prediction1



HG1000052N0_160000_gene_prediction2



gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 



HG1000052N0_10000_gene_prediction 



gi|15079309|gb|AAH11494.1| Similar to
Myosin of the dilute-myosin-V family [Mus 
musculus] 



HG1000052N0_20000_gene_prediction



gi|26324852|dbj|BAC26180.1| unnamed protein
product [Mus musculus]

gi|26324852|dbj|BAC26180.1| unnamed protein
product [Mus musculus] 
HG1000058N0_10000_gene_prediction1


gi|26324852|dbj|BAC26180.1| unnamed protein
product [Mus musculus]


HG1000061N0_5000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000065N0_5000_gene_prediction1


gi|5031571|ref|NP_005713.1| actin-related
protein 2; ARP2 (actin-related protein 2, yeast) 
homolog [Homo sapiens] 


HG1000065N0_10000_gene_prediction1


gi|13386220|ref|NP_081610.1| RIKEN cDNA
2210414H16 [Mus musculus]


HG1000065N0_160000_gene_prediction1


gi|13386220|ref|NP_081610.1| RIKEN cDNA
2210414H16 [Mus musculus] 


HG1000068N0_160000_gene_prediction1


gi|13386220|ref|NP_081610.1| RIKEN cDNA
2210414H16 [Mus musculus] 


HG1000070N0_0_gene_prediction1


gi|26326191|dbj|BAC26839.1| unnamed protein 
product [Mus musculus] 


HG1000073N0_20000_gene_prediction1


gi|21595527|gb|AAH32275.1| Similar to
receptor-like tyrosine kinase [Mus musculus] 


HG1000075N0_160000_gene_prediction1


gi|26326407|dbj|BAC26947.1| unnamed protein 
product [Mus musculus] 


HG1000076N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000081N0_160000_gene_prediction1


gi|4502549|ref|NP_001734.1| calmodulin 2
(phosphorylase kinase, delta); phosphorylase
kinase delta [Homo sapiens]


HG1000106N0_160000_gene_prediction1


gi|6680305|ref|NP_032328.1| heat shock
protein, 84 kDa 1 [Mus musculus] 


HG1000107N0_160000_gene_prediction1


gi|6681225|ref|NP_031905.1| developmentally
regulated GTP binding protein 1;
developmentally regulated GTP-binding
protein 1 [Mus musculus]


HG1000109N0_0_gene_prediction1


gi|6754774|ref|NP_034986.1| myosin heavy
chain, cardiac muscle, adult; alpha cardiac 
MHC; alpha myosin [Mus musculus] 


HG1000112N0_160000_gene_prediction1


gi|23956080|ref|NP_058675.1| putative
serine/threonine kinase [Mus musculus] 


HG1000116N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000126N0_160000_gene_prediction1


gi|6680305|ref|NP_032328.1| heat shock
protein, 84 kDa 1 [Mus musculus] 


HG1000130N0_160000_gene_prediction1


gi|20825377|ref|XP_143696.1| similar to
hypothetical protein dJ12208.2 [Homo 
sapiens] [Mus musculus] 


HG1000132N0_160000_gene_prediction1


gi|6754208|ref|NP_034569.1| high mobility
group box 1; high mobility group protein 1
[Mus musculus]


HG1000133N0_160000_gene_prediction1


gi|26347765|dbj|BAC37531.1| unnamed protein
product [Mus musculus] 


HG1000134N0_20000_gene_prediction


3|263S2599|dbj|BAJ3227J3.2| unnainea proiem 
)roduct [Mus musculus] _ 


HG1000134N0_20000_gene_prediction2


•i<\y> roil OlJUIlD A i~*A A/1 GO 1 1 unnampH TTTfltPITl 

n 26353738|abj|BAC4U4yy. i| unnamea proiem 
>roduct [Mus musculus] 


HG1000142N0_160000_gene_prediction1


gi|26353738|dbj|BAC40499.1| unnamed protein
product [Mus musculus]


HG1000144N0_20000_gene_prediction1


gi|6679108|ref|NP_032748.1| nucleophosmin 1;
nucleolar protein NO38 [Mus musculus]


HG1000145N0_160000_gene_prediction1


gi|6677779|ref|NP_033107.1| ribosomal protein
L28; DNA segment, Chr 7, Wayne State
University 21, expressed [Mus musculus]


HG1000146N0_160000_gene_prediction1


gi|6677779|ref|NP_033107.1| ribosomal protein
L28; DNA segment, Chr 7, Wayne State
University 21, expressed [Mus musculus]


HG1000150N0_10000_gene_prediction1


gi|3717978|emb|CAA73041.1| 5S ribosomal 
protein [Mus musculus] 


HG1000152N0_160000_gene_prediction1


gi|11037798|ref|NP_067621.1| dynactin 5;
dynactin 4; p25 dynactin subunit [Mus
musculus]


HG1000161N0_160000_gene_prediction1


gi|21536242|ref|NP_573499.1| glucocorticoid
induced transcript 1; testhymin;
thymocyte/spermatocyte selection 1 [Mus 
musculus] 


HG1000163N0_160000_gene_prediction1


gi|20819730|ref|XP_129359.1| hypothetical
protein XP_129359 [Mus musculus]


HG1000164N0_5000_gene_prediction1


gi|20835770|ref|XP_132127.1| similar to 60S
RIBOSOMAL PROTEIN L13 [Mus musculus]


HG1000165N0_1000_gene_prediction1


gi|26340448|dbj|BAC33887.1| unnamed protein
product [Mus musculus] 


HG1000166N0_160000_gene_prediction2


gi|26353666|dbj|BAC40463.1| unnamed protein
product [Mus musculus] 


HG1000167N0_160000_gene_prediction1


gi|27369878|ref|NP_766203.1| hypothetical
protein 5330403K09 [Mus musculus] 


HG1000171N0_40000_gene_prediction1


gi|26354683|dbj|BAC4Uyob.l| unnamea proiem 
product [Mus musculus] 


HG1000171N0_160000_gene_prediction1


> gi|2632583o|abj|r>ACzoo/3.i| unnameu piui^" 
product [Mus musculus] 


HG 1 UUU 1 / D IN U_ 1 ouuuu_gcnc_pi cuiuu^. 
n2 


> $n|26325838|dbj|BAC26673.1| unnamed protein 
product [Mus musculus] 


HG1000176N0_1000_gene_prediction1


gi|26354216|dbj|BAC40736.1| unnamed protein
product [Mus musculus]
HG1000176N0_160000_gene_prediction1



HG1000177N0_160000_gene_prediction1



HG1000178N0_160000_gene_prediction1



HG1000178N0_160000_gene_prediction2



HG1000180N0_1000_gene_prediction1



HG1000181N0_10000_gene_prediction



HG1000181N0_160000_gene_prediction1

HG1000183N0_160000_gene_prediction1



HG1000186N0_20000_gene_prediction



Fantom Top Hit Annotation 



gi|26337635|dbj|BAC32503.1| unnamed protein
product [Mus musculus]



gi|26337635|dbj|BAC32503.1| unnamed protein
product [Mus musculus]



gi|20884040|ref|XP_134731.1| endothelial
differentiation, sphingolipid G-protein-coupled
receptor, 5 [Mus musculus]



gi|13384830|ref|NP_079706.1| RIKEN cDNA
1110066C01 [Mus musculus]



gi|13384830|ref|NP_079706.1| RIKEN cDNA
1110066C01 [Mus musculus]



gi|13384730|ref|NP_079640.1| RIKEN cDNA
1110005A23 [Mus musculus]



gi|25023031|ref|XP_205093.1| similar to
hypothetical protein FLJ38281 [Homo sapiens]
[Mus musculus]

gi|26334755|dbj|BAC31078.1| unnamed protein
product [Mus musculus]



HG1000186N0_160000_gene_prediction1



HG1000187N0_20000_gene_prediction

HG1000187N0_160000_gene_prediction3



gi|27370150|ref|NP_766364.1| hypothetical
protein D630002G06 [Mus musculus] 



HG1000189N0_1000_gene_prediction1



HG1000189N0_5000_gene_prediction1



HG1000189N0_1000_gene_prediction2



HG1000189N0_5000_gene_prediction
HG1000195N0_10000_gene_prediction



HG1000199N0_160000_gene_prediction1



HG1000201N0_10000_gene_prediction



HG1000203N0_5000_gene_prediction1



gi|26342222|dbj|BAC34773.1| unnamed protein 
product [Mus musculus]



gi|25024769|ref|XP_207136.1| similar to ORF2
[Mus musculus domesticus] 



gi|26325734|dbj|BAC26621.1| unnamed protein 
product [Mus musculus]



gi|20879992|ref|XP_140210.1| similar to
BG:DS01759.1 gene product [Drosophila
melanogaster] [Mus musculus]

gi|26325734|dbj|BAC26621.1| unnamed protein 
product [Mus musculus]

gi|20879992|ref|XP_140210.1| similar to
BG:DS01759.1 gene product [Drosophila
melanogaster] [Mus musculus]



gi|17390530|gb|AAH18231.1| Unknown 
(protein for MGC:19236) [Mus musculus]



gi|20824845|ref|XP_131963.1| expressed
sequence C77020 [Mus musculus]



gi|27477269|ref|XP_209223.1| similar to
Transforming protein RhoC (H9) [Homo
sapiens]


HG1000204N0_10000_gene_prediction1


gi|26333233|dbj|BAC30334.1| unnamed protein
product [Mus musculus] 


HG1000209N0_160000_gene_prediction2


gi|26326739|dbj|BAC27113.1| unnamed protein
product [Mus musculus]


HG1000215N0_5000_gene_prediction1


gi|27369784|ref|NP_766142.1| hypothetical
protein A230053P19 [Mus musculus]


HG1000215N0_1000_gene_prediction1


gi|6671756|ref|NP_031732.1| suppressor of
cytokine signaling 2; cytokine inducible SH2- 
containing protein 2; high growth; STAT- 
induced STAT inhibitor 2; cytokine-inducible 
SH2 protein 2 [Mus musculus] 


HG1000219N0_10000_gene_prediction1


gi|26328915|dbj|BAC28196.1| unnamed protein
product [Mus musculus] 


HG1000221N0_160000_gene_prediction1


gi|4504255|ref|NP_002097.1| H2A histone
family, member Z; H2AZ histone [Homo
sapiens] 


HG1000221N0_20000_gene_prediction1


gi|11360345|pir||T42725 actin binding protein
ACF7, neural isoform 1 - mouse (fragment) 


HG1000223N0_160000_gene_prediction1


gi|11360345|pir||T42725 actin binding protein
ACF7, neural isoform 1 - mouse (fragment) 


HG1000225N0_160000_gene_prediction1


gi|2501998S|reflXP_207469.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


HG1000235N0_160000_gene_prediction1


gi|20137004|ref|NP_035320.1| proteasome
(prosome, macropain) 28 subunit, beta; 
protease (prosome, macropain) 28 subunit, beta 
[Mus musculus] 


HG1000236N0_160000_gene_prediction1


gi|15617197|ref|NP_077135.1| ATPase, H+
transporting, lysosomal 13kD, V1 subunit G
isoform 1; ATPase, H+ transporting, lysosomal 
(vacuolar proton pump) [Mus musculus] 


HG1000238N0_160000_gene_prediction1


gi|6671704|ref|NP_031664.1| chaperonin
subunit 7 (eta) [Mus musculus] 


HG1000238N0_5000_gene_prediction1


gi|6671549|ref|NP_031479.1| anti-oxidant
protein 2; acidic calcium-independent 
phospholipase A2; peroxiredoxin 5; 1-Cys Prx 
[Mus musculus] 


HG1000239N0_160000_gene_prediction1


gi|6671549|ref|NP_031479.1| anti-oxidant
protein 2; acidic calcium-independent
phospholipase A2; peroxiredoxin 5; 1-Cys Prx
[Mus musculus]


HG1000241N0_160000_gene_prediction1


gi|7657357|ref|NP_056596.1| nucleosome
assembly protein 1-like 1; nucleosome 
assembly protein-1 [Mus musculus] 
HG1000243N0_160000_gene_prediction1


gi|4759158|ref|NP_004588.1| small nuclear
ribonucleoprotein D2 polypeptide 16.5kDa; 
small nuclear ribonucleoprotein D2 polypeptide 
(16.5kD) [Homo sapiens] 


HG1000243N0_160000_gene_prediction2


gi|8393534|ref|NP_058653.1| high mobility
group protein 17 [Mus musculus]


HG 1 000245MM O00_gene_predictionl 


gi|8393534|ref|NP_058653.1| high mobility
group protein 17 [Mus musculus]


HG1000250N0_160000_gene_prediction1


gi|12850132|dbj|BAB28604.1| unnamed protein 
product [Mus musculus] 


HG1000252N0_160000_gene_prediction1


gi|20824845|ref|XP_131963.1| expressed
sequence C77020 [Mus musculus] 


HG1000255N0_10000_gene_prediction1


gi|17105394|ref|NP_000975.2| ribosomal
protein L23a; 60S ribosomal protein L23a; 
melanoma differentiation-associated gene 20 
[Homo sapiens] 


HG1000262N0_160000_gene_prediction2


gi|13385532|ref|NP_080303.1| RIKEN cDNA
2700086I23 [Mus musculus]


HG1000263N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000264N0_5000_gene_prediction1


gi|26360198|dbj|BAB25612.2| unnamed protein 
product [Mus musculus] 


HG1000264N0_5000_gene_prediction2


gi|21624617|ref|NP_081018.1| RIKEN cDNA
1110007M04 [Mus musculus]


HG1000265N0_160000_gene_prediction1


gi|21624617|ref|NP_081018.1| RIKEN cDNA 
1110007M04 [Mus musculus]


HG1000266N0_0_gene_prediction1


gi|25070241|ref|XP_192786.1| proline rich 
protein expressed in brain [Mus musculus] 


HG1000266N0_160000_gene_prediction1


gi|12584972|ref|NP_075021.1| lipin 3 [Mus
musculus] 


HG1000267N0_5000_gene_prediction1


gi|26340094|dbj|BAC33710.1| unnamed protein 
product [Mus musculus] 


HG1000270N0_160000_gene_prediction1


gi|6679937|ref|NP_032110.1| glyceraldehyde-
3-phosphate dehydrogenase [Mus musculus] 


HG1000271N0_10000_gene_prediction1


gi|12844196|dbj|BAB26273.1| unnamed protein 
product [Mus musculus] 


HG1000271N0_160000_gene_prediction1


gi|26345908|dbj|BAC36605.1| unnamed protein 
product [Mus musculus]


HG1000273N0_160000_gene_prediction1


gi|26345908|dbj|BAC36605.1| unnamed protein 
product [Mus musculus]


HG1000295N0_160000_gene_prediction1


gi|20888943|ref|XP_129258.1| cDNA sequence
AF233884 [Mus musculus]


HG1000296N0_160000_gene_prediction1


gi|21313266|ref|NP_080089.1| RIKEN cDNA
1200003O06 [Mus musculus]
HG1000299N0_160000_gene_prediction1

HG1000300N0_10000_gene_prediction1



gi|25054735|ref|XP_192839.1| ATPase, class II,
type 9B [Mus musculus]



HG1000306N0_0_gene_prediction1



gi|25024769|ref|XP_207136.1| similar to ORF2 
[Mus musculus domesticus]



HG1000306N0_0_gene_prediction2



HG1000312N0_160000_gene_prediction1



HG1000314N0_1000_gene_prediction1



gi|4506283|ref|NP_003454.1| protein tyrosine
phosphatase type IV A, member 1; Protein 
tyrosine phosphatase IVA1 [Homo sapiens] 



HG1000315N0_160000_gene_prediction1

HG1000330N0_160000_gene_prediction2



HG1000330N0_160000_gene_prediction4



HG1000332N0_10000_gene_prediction1



HG1000341N0_5000_gene_prediction1



HG1000341N0_10000_gene_prediction



HG1000337N0_1000_gene_prediction



gi|4506285|ref|NP_003470.1| protein tyrosine
phosphatase type IVA, member 2, isoform 1;
protein tyrosine phosphatase IVA; protein
tyrosine phosphatase IVA2; phosphatase of 
regenerating liver 2 [Homo sapiens] 



gi|12860388|dbj|BAB31939.1| unnamed protein
product [Mus musculus] 



gi|26344091|dbj|BAC35702.1| unnamed protein 
product [Mus musculus] 



gi|20987322|gb|AAH30185.1| Unknown 
(protein for MGC:29401) [Mus musculus]



gi|4506725|ref|NP_000998.1| ribosomal protein
S4, X-linked X isoform; 40S ribosomal protein 
S4 X isoform; ribosomal protein S4X isoform; 
single-copy abundant mRNA; cell cycle gene 2 
[Homo sapiens] 



HG1000353N0_160000_gene_prediction1



gi|26332837|dbj|BAC30136.1| unnamed protein
product [Mus musculus]

gi|17157989|ref|NP_473384.1| Musashi
homolog 2 (Drosophila) [Mus musculus] 



HG1000357N0_20000_gene_prediction 



gi|25021483|ref|XP_207941.1| similar to
Retrovirus-related POL polyprotein [Mus 
musculus] 



HG1000358N0_5000_gene_prediction1



HG1000359N0_160000_gene_prediction1



HG1000363N0_160000_gene_prediction1



gi|27372319|dbj|BAC53724.1| Piccolo [Mus 
musculus] 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 






FP ID 



HG1000364N0_160000_gene_prediction1



HG1000367N0_160000_gene_prediction1



HG1000379N0_160000_gene_prediction1



HG1000390N0_10000_gene_prediction



HG1000390N0_5000_gene_prediction



HG1000391N0_160000_gene_prediction1



HG1000396N0_160000_gene_prediction2



Fantom Top Hit Annotation 



gi|19484126|gb|AAH25846.1| Unknown 
(protein for MGC:32383) [Mus musculus] 



gi|13928676|ref|NP_113687.1| proline rich
protein 2 [Mus musculus] 



gi|20863632|ref|XP_164160.1| hypothetical
protein XP_164160 [Mus musculus]



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



gi|20892585|ref|XP_147977.1| RIKEN cDNA
2610001E17 [Mus musculus]



gi|20892585|ref|XP_147977.1| RIKEN cDNA 
2610001E17 [Mus musculus]



gi|26330368|dbj|BAC28914.1| unnamed protein 
product [Mus musculus]



HG1000401N0_10000_gene_prediction1



HG1000407N0_160000_gene_prediction1



gi|12853695|dbj|BAB29819.1| unnamed protein 
product [Mus musculus] 



HG1000408N0_160000_gene_prediction2



HG1000414N0_160000_gene_prediction2



gi|25029560|ref|XP_203691.1| similar to
PROBABLE POL POLYPROTEIN [Mus
musculus]



gi|26326871|dbj|BAC27179.1| unnamed protein 
product [Mus musculus] 



HG1000416N0_160000_gene_prediction1



HG100042SN0 
nl 



»_1 60000_genejpredictio 



gi|20902061|ref|XP_147959.1| hypothetical
protein XP_147959 [Mus musculus]



HG1000429N0_160000_gene_prediction1



gi|25032567|ref|XP_207391.1| similar to ORF2
[Mus musculus domesticus]



HG1000431N0_20000_gene_prediction 



gi|25022040|ref|XP_204233.1| similar to ORF2 
[Mus musculus domesticus]



gi|26339864|dbj|BAC33595.1| unnamed protein 
product [Mus musculus] 



HG1000435N0_160000_gene_predictio 



nl 



HG1000441N0_160000_gene_predictio 



gi|8394057|ref|NP_058565.1| low density
lipoprotein receptor-related protein 4; low
density lipoprotein-related protein 4; Low
Density Lipoprotein Receptor Related Protein
4; corin [Mus musculus]



nl 



n2 



HG1000441N0_160000_gene_predictio



gi|26340972|dbj|BAC34148.1| unnamed protein 
product [Mus musculus]



nl 



HG1000446N0_160000_gene_predictio



gi|12836479|dbj|BAB23675.1| unnamed protein
product [Mus musculus]



gi|25029827|ref|XP_207226.1| similar to ORF2
[Mus musculus domesticus]



n2 



HG1000446N0_160000_gene_predictio



gi|25031497|ref|XP_207552.1| similar to
Retrovirus-related POL polyprotein [Mus
musculus]


HG1000449N0_160000_gene_prediction2


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


1 

HG1000451N0_160000_gene_predictio 
n1 


gi|25054021|ref|XP_192811.1| similar to 
transmembrane protease, serine 2 
(Epitheliasin) (Plasmic transmembrane protein 
X) [Mus musculus] 


HG1000455N0_10000_gene_prediction 


gi|20846744|ref|XP_144090.1| similar to 
hypothetical protein FLJ12457 [Mus musculus] 


HG1000461N0_10000_gene_prediction 


gi|20824899|ref|XP_144255.1| hypothetical 
protein XP_144255 [Mus musculus] 


HG1000474N0_5000_gene_prediction1 


gi|12853695|dbj|BAB29819.1| unnamed protein 
product [Mus musculus] 


HG1000476N0_1000_gene_prediction1 


gi|12834707|dbj|BAB23011.1| unnamed protein 
product [Mus musculus] 


HG1000489N0_160000_gene_predictio 
n1 


no blast hit 


HG1000499N0_160000_gene_predictio 
n1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000500N0_160000_gene_predictio 
n1 


gi|20912903|ref|XP_1266o3.1| KilsJiJN cujna 
2410154J16 [Mus musculus] 


HG1000505N0_160000_gene_predictio 
n1 


gi|25044951|ref|XP_195302.1| similar to 
olfactory receptor MOR256-23 [Mus musculus] 


HG1000509N0_10000_gene_prediction 
1 


gi|26334721|dbj|BAC31061.1| unnamed protein 
product [Mus musculus] 


HG1000510N0_160000_gene_predictio 
n1 


gi|12834707|dbj|BAB23011.1| unnamed protein 
product [Mus musculus] 


HG1000513N0_160000_gene_predictio 
n1 


gi|12859663|dbj|BAB31727.1| unnamed protein 
product [Mus musculus] 


HG1000519N0_160000_gene_predictio 
n1 


gi|119146|sp|P20001|EF11_CRIGR Elongation 
factor 1-alpha 1 (EF-1-alpha-1) (Elongation 
factor 1 A-1) (eEF1A-1) (Elongation factor Tu) 
(EF-Tu) 


HG1000521N0_160000_gene_predictio 
n1 


gi|2495301|sp|Q63934|BR3B_MOUSE Brain- 
specific homeobox/POU domain protein 3B 
(BRN-3B) (BRN-3.2) 


HG1000524N0_160000_gene_predictio 
n1 


gi|21280325|dbj|BAB96760.1| type XXVI 
collagen [Mus musculus] 


HG1000530N0_20000_gene_prediction 
1 


gi|6679921|ref|NP_032102.1| gamma- 
aminobutyric acid (GABA-A) receptor, subunit 
rho 2 [Mus musculus] 


HG1000530N0_160000_gene_predictio 
n2 


gi|23622684|ref|XP_156394.2| expressed 
sequence AL023001 [Mus musculus] 
HG1000534N0_160000_gene_predictio 
n1 


• 


HG1000545N0_160000_gene_predictio 
n1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000549N0_160000_gene_predictio 
n1 


gi|26341288|dbj|BAC34306.1| unnamed protein 
product [Mus musculus] 


HG1000549N0_160000_gene_predictio 
n2 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000549N0_160000_gene_predictio 
n3 


gi|21312126|ref|NP_081135.1| RIKEN cDNA 
1110068E11 [Mus musculus] 


HG1000553N0_160000_gene_predictio 
n1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000560N0_160000_gene_predictio 
n2 


gi|25032555|ref|XP_207412.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


HG1000562N0_160000_gene_predictio 
n1 




HG1000566N0_40000_gene_prediction 
1 


gi|20856064|ref|XP_151615.1| hypothetical 
protein XP_151615 [Mus musculus] 


HG1000566N0_160000_gene_predictio 
n1 




HG1000582N0_160000_gene_predictio 
n1 


gi|7656873|ref|NP_056579.1| RIKEN cDNA 
5730583K22 gene [Mus musculus] 


HG1000598N0_160000_gene_predictio 
n1 


gi|4512261|dbj|BAA75227.1| neurochondrin-2 
[Mus musculus] 


HG1000606N0_20000_gene_prediction 


gi|19527094|ref|NP_598640.1| expressed 
sequence AI327031 [Mus musculus] 


HG1000607N0_160000_gene_predictio 
n1 


gi|25058382|ref|XP_206318.1| hypothetical 
protein XP_206318 [Mus musculus] 


HG1000608N0_20000_gene_prediction 
1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000616N0_1000_gene_prediction1 


gi|26387941|dbj|BAC25633.1| unnamed protein 
product [Mus musculus] 


HG1000622N0_160000_gene_predictio 
n2 




HG1000623N0_160000_gene_predictio 
n1 


gi|20904129|ref|XP_155605.1| hypothetical 
protein XP_155605 [Mus musculus] 


HG1000624N0_160000_gene_predictio 
n1 


gi|13542693|gb|AAH05553.1| putative chloride 
channel (similar to Mm Clcn4-2) [Mus 
musculus] 


HG1000625N0_160000_gene_predictio 
n1 


gi|20901495|ref|XP_140099.1| RIKEN cDNA 
5130404H23 [Mus musculus] 


HG1000628N0_40000_gene_prediction 
1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 
HG1000628N0_20000_gene_prediction 
1 



gi|26339720|dbj|BAC33523.1| unnamed protein 
product [Mus musculus] 



HG1000638N0_5000_gene_prediction1 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



HG1000642N0_160000_gene_predictio 
n1 



HG1000646N0_160000_gene_predictio 

HG1000649N0_160000_gene_predictio 
n1 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



gi|25049717|ref|XP_149640.2| similar to gene 
Dbp73D protein - fruit fly (Drosophila 
melanogaster) [Mus musculus] 



HG1000650N0_160000_gene_predictio 
n1 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



HG1000652N0_160000_gene_predictio 
n2 



HG1000656N0_160000_gene_predictio 
n1 



HG1000656N0_160000_gene_predictio 
n2 



gi|26377673|dbj|BAC25377.1| unnamed protein 
product [Mus musculus] 



gi|13384666|reflNP_0795S3.1| nuclear receptor 
binding factor 2 [Mus musculus] 



gi|25050704|ref|XP_133465.2| RIKEN cDNA 
2410004H02 [Mus musculus] 



HG1000659N0_20000_gene_prediction 
1 



gi|25050704|ref|XP_133465.2| RIKEN cDNA 
2410004H02 [Mus musculus] 



HG1000661N0_20000_gene_prediction 
1 



gi|26333733|dbj|BAC30584.1| unnamed protein 
product [Mus musculus] 1 



HG1000664N0_160000_gene_predictio 
n1 



gi|27372319|dbj|BAC53724.1| Piccolo [Mus 
musculus] 



HG1000670N0_160000_gene_predictio 
n1 



HG1000685N0_160000_gene_predictio 
n2 



HG1000690N0_20000_gene_prediction 



1 



gi|6680195|ref|NP_032255.1| histone 
deacetylase 2; DNA segment, Chr 10, Wayne 
State University 179, expressed [Mus 
musculus] 



HG1000690N0_20000_gene_prediction 



gi|26340662|dbj|BAC33993.1| unnamed protein 
product [Mus musculus] 



HG1000696N0_20000_gene_prediction 

J_ . 

HG1000696N0_40000_gene_prediction 



gi|26340662|dbj|BAC33993.1| unnamed protein 
product [Mus musculus] 

gi|26326171|dbj|BAC26829.1| unnamed protein 
product [Mus musculus] 



HG1000697N0_160000_gene_predictio 
n1 

HG1000700N0_160000_gene_predictio 
n2 



gi|25024387|ref|XP_207341.1| hypothetical 
protein XP_207341 [Mus musculus] 
gi|26351279|dbj|BAC39276.1| unnamed protein 
product [Mus musculus] 



HG1000704N0_160000_gene_predictio 
n1 


gi|21644579|ref|NP_660253.1| Williams- 
Beuren syndrome critical region gene 17 [Mus 
musculus] 


HG1000711N0_20000_gene_prediction 
1 


gi|23273683|gb|AAH37239.1| Similar to 
BCL2-associated athanogene 4 [Mus musculus] 


HG1000738N0_160000_gene_predictio 
n1 


gi|12856848|dbj|BAB30802.1| unnamed protein 
product [Mus musculus] 


HG1000739N0_160000_gene_predictio 
n1 


gi|26339470|dbj|BAC33406.1| unnamed protein 
product [Mus musculus] 


HG1000739N0_160000_gene_predictio 
n2 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000740N0_10000_gene_prediction 
1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000743N0_160000_gene_predictio 
n1 


gi|23601536|ref|XP_130965.2| Nice-4 protein 
homolog [Mus musculus] 


HG1000779N0_160000_gene_predictio 
n1 


gi|2627027|dbj|BAA23475.1|Ftp-l [Mus 
musculus] 


HG1000781N0_160000_gene_predictio 
n1 


gi|25023334|ref|XP_204722.1| similar to 
formin [Mus musculus] 


HG1000781N0_160000_gene_predictio 
n2 


gi|26350877|dbj|BAC39075.1| unnamed protein 
product [Mus musculus] 


HG1000786N0_160000_gene_predictio 
n1 


gi|25023581|ref|XP_207103.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


HG1000788N0_1000_gene_prediction1 


gi|26340832|dbj|BAC3407S.l| unnamed protein 
product [Mus musculus] 


HG1000799N0_20000_gene_prediction 
1 


gi|20847912|ref|XP_144610.1| similar to 
KIAA1904 protein [Homo sapiens] [Mus 
musculus] 


HG1000808N0_160000_gene_predictio 
n1 


gi|26345960|dbj|BAC36631.1| unnamed protein 
product [Mus musculus] 


HG1000817N0_160000_gene_predictio 
n1 


gi|20882231|ref|XP_139203.1| similar to 
KIAA0858 protein [Homo sapiens] [Mus 
musculus] 


HG1000822N0_20000_gene_prediction 
1 


gi|13242237|ref|NP_077327.1| Heat shock 
cognate protein 70; heat shock 70kD protein 8 
[Rattus norvegicus] 


HG1000824N0_160000_gene_predictio 
n1 


gi|6680195|ref|NP_032255.1| histone 
deacetylase 2; DNA segment, Chr 10, Wayne 
State University 179, expressed [Mus 
musculus] 


HG1000824N0_10000_gene_prediction 
1 


gi|20883564|ref|XP_152815.1| hypothetical 
protein XP_152815 [Mus musculus] 


HG1000839N0_160000_gene_predictio 
n1 


gi|20883564|ref|XP_152815.1| hypothetical 
protein XP_152815 [Mus musculus] 
HG1000842N0_160000_gene_predictio 
n2 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



HG1000S69N0_l60000_gene_predictio 
nl 



gi|6715564|ref|NP_032607.1| melanoma 
antigen, 80 kDa [Mus musculus] 



HG1000870N0_160000_gene_predictio 
n1 



HG1000870N0_160000_gene_predictio 

n2 

HG1000878N0_20000_gene_prediction 

1 



gi|20881174|ref|XP_147875.1| hypothetical 
protein XP_147875 [Mus musculus] 



gi|27369942|ref|NP_766246.1| hypothetical 
protein 9530051F04 [Mus musculus] 



gi|27369942|ref|NP_766246.1| hypothetical 
protein 9530051F04 [Mus musculus] 



HG1000878N0_20000_gene_prediction gi|27369942|ref|NP_766246.1| hypothetical 
protein 9530051F04 [Mus musculus] 

HG1000904N0_160000_gene_predictio gi|27369942|ref|NP_766246.1| hypothetical 
protein 9530051F04 [Mus musculus] 

HG1000904N0_40000_gene_prediction gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 



gi|3599320|gb|AAC72793.1| ORF2 [Mus 
IhG1000906N0 ™00 pene prediction! muscul us domesticus] 



HG1000906N0_160000_gene_predictio gi|20836822|ref|XP_130277.1| similar to 
plakophilin 4 (p0071) [Mus musculus] 



" musculus domesticus] 

nl 1 — 



j - " product [Mus musculus] 

fel^iNO 160000 .g^5^^59»20|^AAC72793J|OM^u8 

"* "~ musculus domesticus] 

r ~ [product [Mus musculus] . 



■ U|22507385|reflNP_081019.1| RIKEN cDNA 

LniOO0959N0 5000 r™* predicti on! 1 1 100 14F 12 [Mus musculus] _ 

pi00^9N0^^^ teib^8"5|reflNP_081019.1| RIKENc0NaT 

HG1000990N0 500 0 gene predict^ 1 1100 14F 12 [Mus musculus] . 

1 = " |gi| 1 0946762|reflNP_067382. 1 1 triggenng 

I receptor expressed on myeloid cells 3; 

HG1000994N0 10000_gene_prediction triggering receptor expressed on monocytes 3 
I " [f Mus musculus] 



felw^O 160000 genepredictio^ 
™ - [product [Mus musculus] 



iS^i 10000 ^j^^^55\15m^^M^^^ 
- - Product [Mus musculus] 



S^bl^blFo 160000 gen ^re6ictio\^^m^AB3023BA\^^ protein 
"j ■ ~ "" [product [Mus musculus] 
HG1001001N0_0_gene_prediction1 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1001002N0_160000_gene_predictio 
n1 


gi|27370034|ref|NP_766297.1| hypothetical 
protein A530025J20 [Mus musculus] 


HG1001003N0_0_gene_prediction1 


gi|20348159|ref|XP_111588.1| similar to 
TRAV9D-3 [Mus musculus] 


HG1001007N0_160000_gene_predictio 
n2 


gi|27370034|ref|NP_766297.1| hypothetical 
protein A530025J20 [Mus musculus] 


HG1001011N0_160000_gene_predictio 
n1 


gi|13097000|gb|AAH03291.1| Similar to 
hypothetical protein FLJ10342 [Mus musculus] 


HG1001011N0_160000_gene_predictio 
n2 


gi|26336525|dbj|BAC31945.1| unnamed protein 
product [Mus musculus] 


HG1001014N0_160000_gene_predictio 
n1 


gi|25047957|ref|XP_130582.2| similar to 
hypothetical protein MGC14161 [Homo 
sapiens] [Mus musculus] 


HG1001014N0_5000_gene_prediction1 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1001017N0_160000_gene_predictio 
n1 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1001020N0_160000_gene_predictio 
n1 


gi|25019831|ref|XP_207463.1| similar to 
CD59B [Mus musculus] 


HG1001024N0_160000_gene_predictio 
n1 


gi|26338976|dbj|BAC33159.1| unnamed protein 
product [Mus musculus] 


HG1001024N0_160000_gene_predictio 
n2 


gi|20915148|ref|XP_149841.1| hypothetical 
protein XP_149841 [Mus musculus] 


HG1001031N0_160000_gene_predictio 
n1 


gi|20915148|ref|XP_149841.1| hypothetical 
protein XP_149841 [Mus musculus] 


HG1001035N0_5000_gene_prediction1 


gi|25071690|ref|XP_193591.1| hypothetical 
protein XP_193591 [Mus musculus] 


HG1001043N0_160000_gene_predictio 
n1 


gi|26347249|dbj|BAC37273.1| unnamed protein 
product [Mus musculus] 


HG1001046N0_5000_gene_prediction1 


gi|6678714|ref|NP_032537.1| lymphoid- 
restricted membrane protein [Mus musculus] 


HG1001046N0_160000_gene_predictio 
n1 


gi|25048969|ref|XP_143803.3| similar to 
bA401.1 (novel protein) [Homo sapiens] [Mus 
musculus] 


1 

HG1001047N0_1000_gene_prediction1 


gi|25021180|ref|XP_207917.1| similar to RNP 
particle component [Mus musculus] 


HG1001048N0_160000_gene_predictio 
n1 


gi|26353724|dbj|BAC40492.1| unnamed protein 
product [Mus musculus] 


t 

HG1001048N0_160000_gene_predictio 
n2 


gi|20343845|ref|XP_109652.1| similar to 
hypothetical protein FLJ25217 [Homo sapiens] 
[Mus musculus] 


HG1001144N0 20000 gene prediction t 


gi|20346197|ref|XP_110161.1| RAN binding 
HG1001148N0_160000_gene_predictio gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 

XiL 3 — — — - — 



iFantom TopHit Annotation 

protein 1 [Mus musculus] 



HG1001172N0 160000 gene_predictio gi|2633962Sldbj|BAC334S5.11 unnamed protein 
^ - [product [Mus musculus] 



HG1001172N0_20000_gene_prediction gi|22122489|ref|NP_666128.1| hypothetical 
protein MGC38936 [Mus musculus] 



HG1001187N0_160000_gene_predictio gi|26340706|dbj|BAC34015.1| unnamed protein 
product [Mus musculus] 



HG1001192N0_160000_gene_predictio gi|18497290|ref|NP_084056.1| protein kinase 
raf1; murine sarcoma 3611 oncogene 1; 
sarcoma 3611 oncogene [Mus musculus] 

n1 



HG1001194N0_160000_gene_predictio gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 

n1 



HG1001199N0_160000_gene_predictio gi|20837732|ref|XP_132241.1| hypothetical 
protein XP_132241 [Mus musculus] 



LlOOl^ONO 160000 gene jredictio gi|2007106S|gb|AAH2734l.l| Similar to 
H01UU1^-uinu_i 0 _e _r [ elongation factor G2 [Mus musculus] 



HG1001223N0_160000_gene_predictio gi|20908735|ref|XP_122598.1| similar to helix 
destabilizing protein - rat [Mus musculus] 

n1 



iGlb01229N0 160000 g^re^fe^^ to ORF2 

IT - ~ | [ Mus musculus domesticus] 



HG1001230N0_5000_gene_prediction1 



gi|6754206|ref|NP_034568.1| hexokinase 1; 
downeast anemia [Mus musculus] 



HG1001235N0_160000_gene_predictio 
n1 

HG1001235N0_10000_gene_prediction 
1 



gi|12857205|dbj|BAB30930.1| unnamed protein 
product [Mus musculus] 

gi|21703918|ref|NP_663438.1| hypothetical 
protein BC024118 [Mus musculus] 



HG1001235N0_20000_gene_prediction 
1 

HG1001235N0_160000_gene_predictio 
n2 



HG1001235N0_160000_gene_predictio 
n3 



HG1001260N0_160000_gene_predictio 
n1 



HG1001260N0_40000_gene_prediction 
1 



gi|26339338|dbj|BAC33340.1| unnamed protein 
product [Mus musculus] 



gi|26339338|dbj|BAC33340.1| unnamed protein 
product [Mus musculus] 



gi|26340904|dbj|BAC34114.1| unnamed protein 
product [Mus musculus] 



gi|26327795|dbj|BAC27638.1| unnamed protein 
product [Mus musculus] 



gi|8922328|ref|NP_060517.1| hypothetical 
protein FLJ10290 [Homo sapiens] 



HG1001264N0_160000_gene_predictio 
n1 



HG1001274N0_160000_gene_predictio 



gi|8922328|ref|NP_060517.1| hypothetical 
protein FLJ10290 [Homo sapiens] 



igil2638319SjdbilBAC25520.il unnamed protein 
n1 


product [Mus musculus] 


HG1001284N0_160000_gene_predictio 
n1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1001284N0_160000_gene_predictio 
n2 


gi|26326843|dbj|BAC27165.1| unnamed protein 
product [Mus musculus] 


HG1001292N0_160000_gene_predictio 
n1 


gi|26326843|dbj|BAC27165.1| unnamed protein 
product [Mus musculus] 


HG1001302N0_160000_gene_predictio 
n1 


gi|13097342|gb|AAH03421.1| Similar to 
ATPase, H+ transporting, lysosomal (vacuolar 
proton pump) 31kD [Mus musculus] 


HG1001313N0_160000_gene_predictio 
n1 


gi|12852631|dbj|BAB29486.1| unnamed protein 
product [Mus musculus] 


HG1001323N0_160000_gene_predictio 
n1 


gi|25053141|ref|XP_193739.1| similar to 
betaine-homocysteine methyltransferase 
[Rattus norvegicus] [Mus musculus] 


HG1001328N0_5000_gene_prediction1 


gi|26347687|dbj|BAC37492.1| unnamed protein 
product [Mus musculus] 


HG1001328N0_40000_gene_prediction 
1 


gi|26352918|dbj|BAC40089.1| unnamed protein 
product [Mus musculus] 


HG1001331N0_0_gene_prediction1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1001335N0_160000_gene_predictio 
n1 


gi|20381292|gb|AAH27770.1| stromal cell 
derived factor receptor 2 [Mus musculus] 


HG1001335N0_160000_gene_predictio 
n2 


gi|2193870|dbj|BAA20419.1| reverse 
transcriptase [Mus musculus] 


HG1001348N0_160000_gene_predictio 
n1 


gi|2193870|dbj|BAA20419.1| reverse 
transcriptase [Mus musculus] 


HG1001349N0_160000_gene_predictio 
n1 


gi|20846538|ref|XP_150033.1| hypothetical 
protein XP_150033 [Mus musculus] 


HG1001354N0_160000_gene_predictio 
n1 


gi|7305215|ref|NP_038599.1| kinase suppressor 
of ras [Mus musculus] 


HG1001361N0_160000_gene_predictio 
n1 


gi|6678690|ref|NP_032525.1| LIM homeobox 
protein 5; LIM homeo box protein 5 [Mus 
musculus] 


HG1001376N0_160000_gene_predictio 
n1 


gi|20345901|ref|XP_109824.1| hypothetical 
protein XP_109824 [Mus musculus] 


HG1001376N0_5000_gene_prediction1 


gi|27261816|ref|NP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 


HG1001376N0_20000_gene_prediction 
1 


gi|27261816|ref|NP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 


HG1001376N0_5000_gene_prediction2 


gi|27261816|ref|NP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 


■ 

HG1001376N0_5000_gene_prediction3 


gi|27261816|ref|NP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 
HG1001417N0_160000_gene_predictio 
n1 


gi|27261816|ref|NP_080861.1| RIKEN cDNA 
C530005J20 [Mus musculus] 


HG1001417N0_1000_gene_prediction1 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001417N0_160000_gene_predictio 
n2 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001417N0_160000_gene_predictio 
n3 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001436N0_5000_gene_prediction1 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001436N0_20000_gene_prediction 
1 


gi|20987280|gb|AAH29643.1| Unknown 
(protein for MGC:25768) [Mus musculus] 
gi|25051637|ref|XP_194491.1| RIKEN cDNA 
1110053F02 [Mus musculus] 


HG1001436N0_160000_gene_predictio 
n1 


HG1001439N0_160000_gene_predictio 
n1 


gi|25051637|ref|XP_194491.1| RIKEN cDNA 
1110053F02 [Mus musculus] 


HG1001484N0_160000_gene_predictio 
n1 


gi|6753290|ref|NP_033943.1| calsequestrin 1 
[Mus musculus] 


HG1001485N0_10000_gene_prediction 
1 


gi|25029827|ref|XP_207226.1| similar to ORF2 
[Mus musculus domesticus] 


HG1001500N0_160000_gene_predictio 
n1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1001500N0_160000_gene_predictio 
n2 


gi|6679108|ref|NP_032748.1| nucleophosmin 1, 
nucleolar protein NO38 [Mus musculus] 


HG1001508N0_160000_gene_predictio 
n1 


gi|25029928|ref|XP_207257.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 




gi|203406S3|ref|XP 1 10361 .1| similar to 
phospholipase C beta 2 [Rattus norvegicus] 
[Mus musculus] 1 






WO 2005/005597 

PCT/US2003/027106 

Examples 

[0602] The examples, which are intended to be purely exemplary of the 
invention and should therefore not be considered to limit the invention in any way, 
also describe and detail aspects and embodiments of the invention discussed above. 
The examples are not intended to represent that the experiments below are all or the 
only experiments performed. Efforts have been made to ensure accuracy with respect 
to numbers used (e.g., amounts, temperature, etc.) but some experimental errors and 
deviations should be accounted for. Unless indicated otherwise, parts are parts by 
weight, molecular weight is weight average molecular weight, temperature is in 
degrees Centigrade, and pressure is at or near atmospheric. 

[0603] While the present invention has been described with reference to the 
specific embodiments thereof, it should be understood by those skilled in the art that 
various changes may be made and equivalents may be substituted without departing 
from the true spirit and scope of the invention. In addition, many modifications can 
be made to adapt a particular situation, material, composition of matter, process, 
process step or steps, to the objective, spirit and scope of the present invention. All 
such modifications are intended to be within the scope of the claims appended hereto. 

[0604] Additional objects and advantages of the invention will be set forth in 
part in the description which follows, and in part will be obvious from the description, 
or may be learned by practice of the invention. The objects and advantages of the 
invention will be realized and attained by means of the elements and combinations 
particularly pointed out in the appended claims. Moreover, advantages described in 
the body of the specification, if not included in the claims, are not per se limitations to 
the claimed invention. 

[0605] It is to be understood that both the foregoing general description and 
the following detailed description are exemplary and explanatory only and are not 
restrictive of the invention, as claimed. Moreover, it must be understood that the 
invention is not limited to the particular embodiments described, as such may, of 
course, vary. Further, the terminology used to describe particular embodiments is not 
intended to be limiting, since the scope of the present invention will be limited only 
by its claims. 

[0606] With respect to ranges of values, the invention encompasses each 
intervening value between the upper and lower limits of the range to at least a tenth of 
the lower limit's unit, unless the context clearly indicates otherwise. Further, the 
invention encompasses any other stated intervening values. Moreover, the invention 
also encompasses ranges excluding either or both of the upper and lower limits of the 
range, unless specifically excluded from the stated range. 

[0607] Unless defined otherwise, the meanings of all technical and scientific 
terms used herein are those commonly understood by one of ordinary skill in the art to 
which this invention belongs. One of ordinary skill in the art will also appreciate that 
any methods and materials similar or equivalent to those described herein can also be 
used to practice or test the invention. Further, all publications mentioned herein are 
incorporated by reference. 

[0608] It must be noted that, as used herein and in the appended claims, the 
singular forms "a," "or," and "the" include plural referents unless the context clearly 
dictates otherwise. Thus, for example, reference to "a subject polypeptide" includes a 
plurality of such polypeptides and reference to "the agent" includes reference to one 
or more agents and equivalents thereof known to those skilled in the art, and so forth. 

[0609] Further, all numbers expressing quantities of ingredients, reaction 
conditions, % purity, polypeptide and polynucleotide lengths, and so forth, used in the 
specification and claims, are modified by the term "about," unless otherwise 
indicated. Accordingly, the numerical parameters set forth in the specification and 
claims are approximations that may vary depending upon the desired properties of the 
present invention. At the very least, and not as an attempt to limit the application of 
the doctrine of equivalents to the scope of the claims, each numerical parameter 
should at least be construed in light of the number of reported significant digits, 
applying ordinary rounding techniques. Nonetheless, the numerical values set forth in 
the specific examples are reported as precisely as possible. Any numerical value, 
however, inherently contains certain errors from the standard deviation of its 
experimental measurement. 

[0610] The publications discussed herein are provided solely for their 
disclosure prior to the filing date of the present application. Nothing herein is to be 
construed as an admission that the present invention is not entitled to antedate such 
publication by virtue of prior invention. Further, the dates of publication provided 
may be different from the actual publication dates which may need to be 
independently confirmed. 
Example 1: Expression in E. coli 

[0611] Sequences can be expressed in E. coli. Any one or more of the 
sequences according to SEQ ID NOS.: 1-104 can be expressed in E. coli by 
subcloning the entire coding region, or a selected portion thereof, into a prokaryotic 
expression vector. For example, the expression vector pQE16 from the QIA 
expression prokaryotic protein expression system (Qiagen, Valencia, CA) can be 
used. The features of this vector that make it useful for protein expression include an 
efficient promoter (phage T5) to drive transcription, expression control provided by 
the lac operator system, which can be induced by addition of IPTG (isopropyl-beta-D-
thiogalactopyranoside), and an encoded 6XHis tag coding sequence. The encoded tag is a 
stretch of six histidine amino acid residues which can bind very tightly to a nickel 
atom. This vector can be used to express a recombinant protein with a 6XHis tag 
fused to its carboxyl terminus, allowing rapid and efficient purification using 
Ni-coupled affinity columns. 
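
The reading-frame requirement behind a carboxyl-terminal 6XHis fusion can be sketched in a few lines of Python. This is an illustrative sketch only, not the vector's actual sequence: in practice pQE16 supplies its own tag-coding region, and the function name `add_c_terminal_his_tag`, the choice of the CAT histidine codon, and the TAA terminator are assumptions made for the example.

```python
# Hypothetical helper: builds an in-frame C-terminal His-tag fusion in silico.
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS_CODON = "CAT"  # CAT and CAC both encode histidine; CAT is an arbitrary choice


def add_c_terminal_his_tag(orf: str, n_his: int = 6) -> str:
    """Return the ORF with its stop codon removed and n_his His codons appended."""
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    if not seq.startswith("ATG"):
        raise ValueError("coding region must begin with an ATG start codon")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # The native stop codon must be dropped, or translation ends before the tag.
    if codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal stop codon found; the tag would not be translated")
    # Append the tag codons and a new terminator (TAA chosen for illustration).
    return "".join(codons) + HIS_CODON * n_his + "TAA"


if __name__ == "__main__":
    print(add_c_terminal_his_tag("ATGGCTTAA"))
```

The check for an internal stop codon reflects why the tag must be in frame with, and downstream of, an uninterrupted coding region.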

[0612] The entire or the selected partial coding region can be amplified by 
PCR, then ligated into digested pQE16 vector. The ligation product can be 
transformed by electroporation into electrocompetent E. coli cells (for example, strain 
M15[pREP4] from Qiagen), and the transformed cells may be plated on 
ampicillin-containing plates. Colonies may then be screened for the correct insert in the proper 
orientation using a PCR reaction employing a gene-specific primer and a 
vector-specific primer. Also, positive clones can be sequenced to ensure correct orientation 
and sequence. To express the proteins, a colony containing a correct recombinant 
clone can be inoculated into L-Broth containing 100 µg/ml of ampicillin and 25 
µg/ml of kanamycin, and the culture allowed to grow overnight at 37 degrees C. The 
saturated culture may then be diluted 20-fold in the same medium and allowed to 
grow to an optical density of 0.5 at 600 nm. At this point, IPTG can be added to a 
final concentration of 1 mM to induce protein expression. After growing the culture 
for an additional 5 hours, the cells may be harvested by centrifugation at 3000 x g 
for 15 minutes. 
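
The 20-fold dilution and the 1 mM IPTG induction above reduce to simple volume arithmetic, sketched here in Python. The 100 ml culture volume and the 1 M IPTG stock in the usage lines are assumptions chosen for illustration; neither value is specified in the text, and the function names are hypothetical.

```python
def dilution_volumes(total_ml: float, fold: float) -> tuple[float, float]:
    """Split a final volume into (inoculum, fresh medium) for a fold dilution."""
    inoculum_ml = total_ml / fold
    return inoculum_ml, total_ml - inoculum_ml


def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock to add so the culture reaches final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v), i.e. C1*V1 = C2*V2
    with the added volume counted in the final volume.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / (stock_mM - final_mM)


if __name__ == "__main__":
    # Assumed example: 100 ml induction culture, 1 M (1000 mM) IPTG stock.
    inoculum, medium = dilution_volumes(total_ml=100.0, fold=20.0)
    iptg = stock_volume_ml(final_mM=1.0, culture_ml=100.0, stock_mM=1000.0)
    print(f"{inoculum:.1f} ml overnight culture + {medium:.1f} ml medium")
    print(f"add {iptg * 1000:.0f} ul IPTG stock at OD600 of 0.5")
```

With a concentrated stock the correction for the added volume is negligible, which is why the simpler estimate of 1 µl of 1 M stock per ml of culture is commonly used.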

[0613] The resultant pellet can be lysed with a mild, nonionic detergent in 20 
mM Tris-HCl (pH 7.5) (B-PER™ Reagent from Pierce, Rockford, IL), or by 
sonication until the turbid cell suspension turns translucent. The resulting lysate can 
be further purified using a nickel-containing column (Ni-NTA spin column from 
Qiagen) under non-denaturing conditions. Briefly, the lysate will be adjusted to 300 
mM NaCl and 10 mM imidazole, then centrifuged at 700 x g through the nickel 
spin column to allow the His-tagged recombinant protein to bind to the column. The 
column will be washed twice with wash buffer (for example, 50 mM NaH2PO4, pH 
8.0; 300 mM NaCl; 20 mM imidazole) and eluted with elution buffer (for example, 50 
mM NaH2PO4, pH 8.0; 300 mM NaCl; 250 mM imidazole). All the above 
procedures will be performed at 4 degrees C. The presence of a purified protein of 
the predicted size can be confirmed with SDS-PAGE. 
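
The wash and elution buffer compositions given above can be converted to weighed masses as follows. This is a rough sketch that assumes anhydrous NaH2PO4 (a hydrate would need a different molecular weight) and a 1 liter batch; the molecular weights are standard values, and pH adjustment to 8.0 is not modeled.

```python
# Molecular weights in g/mol; NaH2PO4 is assumed to be the anhydrous salt.
MOLECULAR_WEIGHT = {
    "NaH2PO4": 119.98,
    "NaCl": 58.44,
    "imidazole": 68.08,
}

# Concentrations in mM, taken from the example buffers in the text.
WASH_BUFFER = {"NaH2PO4": 50.0, "NaCl": 300.0, "imidazole": 20.0}
ELUTION_BUFFER = {"NaH2PO4": 50.0, "NaCl": 300.0, "imidazole": 250.0}


def grams_needed(recipe_mM: dict[str, float], volume_l: float) -> dict[str, float]:
    """Mass of each component: (mM / 1000) mol/l * volume in l * g/mol."""
    return {
        name: conc_mM / 1000.0 * volume_l * MOLECULAR_WEIGHT[name]
        for name, conc_mM in recipe_mM.items()
    }


if __name__ == "__main__":
    for label, recipe in (("wash", WASH_BUFFER), ("elution", ELUTION_BUFFER)):
        masses = grams_needed(recipe, volume_l=1.0)  # assumed 1 l batch
        summary = ", ".join(f"{name} {grams:.2f} g" for name, grams in masses.items())
        print(f"{label} buffer per liter: {summary}")
```

The two buffers differ only in imidazole, which competes with the histidine tag for nickel binding: 20 mM removes weakly bound contaminants, while 250 mM displaces the tagged protein.
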
Example 2: Expression in Mammalian Cells 

[0614] The sequences encoding the proteins of Example 1 can be cloned into 
the pENTR vector (Invitrogen) by PCR and transferred to the mammalian expression 
vector pDEST12.2 per manufacturer's instructions (Invitrogen). Introduction of the 
recombinant construct into the host cell can be effected by transfection with Fugene 6 
(Roche) per manufacturer's instructions. The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the 
gene product encoded by the isolated fragment (in the case of an ORF). A number of 
types of cells can act as suitable host cells for expression of the proteins. Mammalian 
host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) 
cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 
3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell 
strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, 
mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. 

Example 3: Expression in Cell-Free Translation Systems 
[0615] Cell-free translation systems can also be employed to produce 
proteins using RNAs derived from the DNA constructs of the present invention. 
Appropriate cloning and expression vectors containing SP6 or T7 promoters for use 
with prokaryotic and eukaryotic hosts have been described (Sambrook et al., 1989). 
These DNA constructs can be used to produce proteins in a rabbit reticulocyte lysate 
system or in a wheat germ extract system. 

[0616] Specific expression systems of interest include plant, bacterial, yeast, 
insect cell and mammalian cell derived expression systems. Expression systems in 
plants include those described in U.S. Patent No. 6,096,546 and U.S. Patent No. 
6,127,145. Expression systems in bacteria include those described by Chang et al., 
1978, Goeddel et al., 1979, Goeddel et al., 1980, EP 0 036,776, U.S. Patent No. 
4,551,433; DeBoer et al., 1983, and Siebenlist et al., 1980. 

[0617] Mammalian expression is further accomplished as described in 
Dijkema et al., 1985, Gorman et al., 1982, Boshart et al., 1985, and U.S. Patent No. 
4,399,216. Other features of mammalian expression are facilitated as described in 
Ham and Wallace, Meth. Enz., 1979, Barnes and Sato, 1980, U.S. Patent Nos. 
4,767,704, 4,657,866, 4,927,762, 4,560,655, WO 90/103430, WO 87/00195, and U.S. 
RE 30,985. 

Example 4: Expression of the Secreted Factors in Yeast 

[0618] Primers can be designed to amplify the secreted factors using PCR 
and cloned into pENTR/D-TOPO vectors (Invitrogen, Carlsbad, CA). The secreted 
factors in pENTR/D-TOPO can be cloned into the yeast expression vector 
pYES-DEST52 by Gateway LR reaction (Invitrogen, Carlsbad, CA). The resulting yeast 
expression vectors can be transformed into the INVSc1 strain from Invitrogen to express 
the secreted factors according to the manufacturer's protocol (Invitrogen, Carlsbad, 
CA). The expressed secreted factors will have a 6XHis tag at the C-terminus. 
Expressed protein can be purified with ProBond™ resin (Invitrogen, Carlsbad, CA). 

[0619] Expression systems in yeast include those described in Hinnen et al., 
1978, Ito et al., 1983, Kurtz et al., 1986, Kunze et al., 1985, Gleeson et al., 1986, 
Roggenkamp et al., 1986, Das et al., 1984, De Louvencourt et al., 1983, Van den Berg 
et al., 1990, Kunze et al., 1985, Cregg et al., 1985, U.S. Patent No. 4,837,148, U.S. 
Patent No. 4,929,555, Beach and Nurse, 1981, Davidow et al., 1985, Gaillardin et al., 
1985, Ballance et al., 1983, Tilburn et al., 1983, Yelton et al., 1984, Kelly and Hynes, 
1985, EP 0 244,234, and WO 91/00357. 

Example 5: Expression of Secreted Factors in Baculovirus Expression 
System. 

[0620] The secreted factors in pENTR/D-TOPO can be cloned into the 
Baculovirus expression vector pDEST10 by Gateway LR reaction (Invitrogen, 
Carlsbad, CA). The secreted factors can be expressed by the Bac-to-Bac expression 
system from Invitrogen (Carlsbad, CA), briefly described as follows. The expression 
vectors containing the secreted factors are transformed into the competent DH10Bac™ E. 
coli strain and selected for transposition. The resulting E. coli contain recombinant 
bacmid that contains the secreted factor. High molecular weight DNA can be isolated 
from the E. coli containing the recombinant bacmid and then transfected into insect 
cells with Cellfectin reagent. The expressed secreted factors will have a 6XHis tag at 
the N-terminus. Expressed protein will be purified by ProBond™ resin (Invitrogen, 
Carlsbad, CA). 

[0621] Expression of heterologous genes in insects can be accomplished as 
described in U.S. Patent No. 4,745,051; Doerfler et al., 1987; Friesen et al., 1986; EP 
0 127,839, EP 0 155,476, Vlak et al., 1988, Miller et al., 1988, Carbonell et al., 1988, 
Maeda et al., 1985, Lebacq-Verheyden et al., 1988, Smith et al., 1985, Miyajima et 
al., and Martin et al., 1988. Numerous baculoviral strains and variants and 
corresponding permissive insect host cells have been previously described 
(Setlow et al., 1986, Luckow et al., 1988; Miller et al., 1986; Maeda et al., 1985). 

Example 6: Primer Design 

[0622] To design the forward primer for PCR amplification, the melting 
point of the first 20 to 24 bases of the primer can be calculated by counting total A 
and T residues, then multiplying by 2. To design the reverse primer for PCR 
amplification, the melting point of the first 20 to 24 bases of the reverse complement, 
with the sequences written from 5-prime to 3-prime can be calculated by counting the 
total G and C residues, then multiplying by 4. Both start and stop codons can be 
present in the final amplified clone. The length of each primer is chosen to give a 
melting temperature between 63 degrees C and 68 degrees C. Adding the bases "CACC" 
to the forward primer renders it compatible for cloning the PCR product with the 
TOPO pENTR/D (Invitrogen, CA). 
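
The primer-length selection described in this example can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used: it assumes that the A/T count multiplied by 2 and the G/C count multiplied by 4 are summed into a single estimate for each primer (the common "2 + 4" rule of thumb), and the function names `wallace_tm`, `pick_primer`, and `design_primers` are hypothetical.

```python
# Hypothetical helper names; a sketch under the assumptions stated above.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement, written 5-prime to 3-prime."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(seq):
    """Estimate melting temperature (degrees C) as 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def pick_primer(template, min_len=20, max_len=24, tm_low=63, tm_high=68):
    """Return the shortest 5-prime prefix (20 to 24 bases) whose estimated
    melting temperature falls within the target window, or None."""
    for n in range(min_len, max_len + 1):
        if n > len(template):
            break
        candidate = template[:n]
        if tm_low <= wallace_tm(candidate) <= tm_high:
            return candidate
    return None


def design_primers(orf):
    """Return (forward, reverse) primers for an open reading frame.

    The forward primer is taken from the start of the sequence and prefixed
    with CACC for directional TOPO cloning; the reverse primer is taken from
    the start of the reverse complement.
    """
    forward = pick_primer(orf)
    reverse = pick_primer(reverse_complement(orf))
    if forward is not None:
        forward = "CACC" + forward
    return forward, reverse
```

A sequence for which no prefix of 20 to 24 bases falls in the 63 to 68 degree window yields `None`, which signals that the primer needs to be designed by hand.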

Example 7: Reverse Transcriptase Reaction 

[0623] cDNA can be prepared by the following method. Between 200 ng 
and 1.0 µg mRNA is added to 2 µl DMSO and the volume adjusted to 11 µl with 
DEPC-treated water. One µl Oligo dT is added to the tube, and the mixture is heated 
at 70° C for 5 min., quickly chilled on ice for 2 min., and collected at 
the bottom of the tube by brief centrifugation. The following 1st strand components 
are then added to the mRNA mixture: 2 µl 10X Stratascript (Stratagene, CA) 1st strand 
buffer, 1 µl 0.1 M DTT, 1 µl 10 mM dNTP mix (10 mM each of dG, dA, dT and 
dCTP), 1 µl RNAse inhibitor, 3 µl Stratascript RT (50 U/µl). The contents are gently 
mixed and the mixture collected by brief centrifugation. The mixture is incubated in a 
42° C water bath for 1 hour, placed in a 70° C water bath for 15 min. to stop the 
reaction, transferred to ice for 2 min., and centrifuged briefly in a microfuge to collect 
the reaction product at the bottom of the reaction vessel. Two µl RNAse H is then 
added to the tube, the contents are mixed well, incubated at 37° C in a water bath for 
20 min., and centrifuged briefly in a microfuge to collect the reaction product at the 
bottom of the reaction vessel. The reaction mixture can proceed directly to PCR or be 
stored at -20° C. 
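
As a check on the volumes listed in this example, the following Python sketch totals the first-strand reaction and derives the final concentrations of the stock reagents. It is illustrative only; the names `FIRST_STRAND_MIX`, `total_volume`, and `final_concentration` are hypothetical, and the volumes are read as microliters.

```python
# Hypothetical names; volumes in microliters as read from the example above.
FIRST_STRAND_MIX = [
    ("mRNA + 2 ul DMSO, adjusted with DEPC-treated water", 11.0),
    ("Oligo dT", 1.0),
    ("10X first-strand buffer", 2.0),
    ("0.1 M DTT", 1.0),
    ("10 mM dNTP mix", 1.0),
    ("RNAse inhibitor", 1.0),
    ("Stratascript RT, 50 U/ul", 3.0),
]


def total_volume(components):
    """Sum the volumes of all components."""
    return sum(volume for _, volume in components)


def final_concentration(stock, volume_added, total):
    """Dilution rule: final = stock * volume_added / total."""
    return stock * volume_added / total


total = total_volume(FIRST_STRAND_MIX)                 # reaction volume in ul
buffer_x = final_concentration(10.0, 2.0, total)       # buffer strength (X)
dtt_mm = final_concentration(100.0, 1.0, total)        # DTT in mM
dntp_mm = final_concentration(10.0, 1.0, total)        # each dNTP in mM
rt_units = 3.0 * 50.0                                  # units of RT added

print(total, buffer_x, dtt_mm, dntp_mm, rt_units)
```

Under this reading the components sum to 20 µl, so the 2 µl of 10X buffer is diluted to 1X, which is consistent with the stated volumes.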

Example 8: Full Length PCR 

[0624] Full length PCR can be achieved by placing the products of the 
reaction described in Example 7, with primers diluted to 5 µM in water, into a reaction 
vessel and adding a reaction mixture composed of 1x Taq buffer, 25 mM dNTP, 10 ng 
cDNA pool, TaqPlus (Stratagene, CA) (5 U/µl), PfuTurbo (Stratagene, CA) (2.5 U/µl), and 
water. The contents of the reaction vessel are then mixed gently by inversion 5-6 
times, placed into a reservoir where 2 µl Fi/Rj primers are added, the plate sealed and 
placed in the thermocycler. The PCR reaction comprises the following eight 
steps. Step 1: 95° C for 3 min. Step 2: 94° C for 45 sec. Step 3: 0.5° C/sec to 56-60° 
C. Step 4: 56-60° C for 50 sec. Step 5: 72° C for 5 min. Step 6: Go to step 2, 
perform 35-40 cycles. Step 7: 72° C for 20 min. Step 8: 4° C. 
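
The eight-step program above can be written down as data and used to estimate the run time. The following Python sketch is illustrative only; the names `build_program` and `estimated_runtime_seconds` are hypothetical, and the estimate assumes that only the controlled ramp of Step 3 takes appreciable time (all other temperature transitions are treated as instantaneous).

```python
# Hypothetical names; a rough estimate under the assumptions stated above.

def build_program(anneal_c=58.0, cycles=35):
    """Return the thermocycler program as a list of step dictionaries."""
    return [
        {"step": 1, "temp_c": 95.0, "hold_s": 3 * 60},
        {"step": 2, "temp_c": 94.0, "hold_s": 45},
        {"step": 3, "ramp_c_per_s": 0.5, "target_c": anneal_c},
        {"step": 4, "temp_c": anneal_c, "hold_s": 50},
        {"step": 5, "temp_c": 72.0, "hold_s": 5 * 60},
        {"step": 6, "goto": 2, "cycles": cycles},
        {"step": 7, "temp_c": 72.0, "hold_s": 20 * 60},
        {"step": 8, "temp_c": 4.0, "hold_s": None},  # final hold, open-ended
    ]


def estimated_runtime_seconds(anneal_c=58.0, cycles=35, ramp_c_per_s=0.5):
    """Estimate total time for Steps 1 through 7, excluding the final hold."""
    initial = 3 * 60
    ramp = (94.0 - anneal_c) / ramp_c_per_s      # Step 3: slow ramp down
    per_cycle = 45 + ramp + 50 + 5 * 60          # Steps 2 through 5
    final_extension = 20 * 60
    return initial + cycles * per_cycle + final_extension


hours = estimated_runtime_seconds(58.0, 35) / 3600.0
print(f"approximately {hours:.1f} h")
```

With the 5 min. extension in every cycle, the extension step dominates the run time, so the choice of 35 versus 40 cycles changes the total considerably more than the choice of annealing temperature within 56-60° C.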

[0625] The products can then be separated on a standard 0.8 to 1.0% agarose 
gel at 40 to 80 V, the bands of interest excised by cutting from the gel, and stored at 
-20° C until extraction. The material in the bands of interest can be purified with 
QIAquick 96 PCR Purification Kit (Qiagen, CA) according to the manufacturer's 
instructions. Cloning can be performed with the pENTR/D-TOPO 
vector (Invitrogen, CA) according to the manufacturer's instructions. 

SEQUENCE LISTING 
[0627] A sequence listing transmittal sheet and a sequence listing in paper 
format accompany this application. 

CLAIMS 

1. A first nucleic acid molecule comprising a polynucleotide sequence 
chosen from at least one polynucleotide sequence according to SEQ ID NOS.: 1-104. 

2. The nucleic acid molecule of claim 1, wherein the nucleic acid 
molecule is a DNA or a RNA molecule. 

3. An animal injected with the nucleic acid molecule of claim 1. 

4. A double-stranded isolated nucleic acid molecule comprising the first 
nucleic acid molecule of claim 1 and its complement. 

5. The nucleic acid molecule of claim 4, wherein the first polynucleotide 
sequence encodes a polypeptide chosen from a polypeptide comprising a signal 
peptide, a mature polypeptide that lacks a signal peptide, a signal peptide, a 
biologically active fragment of a polypeptide, a polypeptide lacking a signal peptide 
cleavage site, a polypeptide consisting essentially of a N-terminal fragment that 
contains a Pfam domain, and a polypeptide consisting essentially of a C-terminal 
fragment that contains a Pfam domain. 

6. A second nucleic acid molecule comprising a second polynucleotide 
sequence that is at least about 70%, or about 80%, or about 90%, or about 95% 
homologous to the first nucleic acid molecule of claim 1. 

7. A second isolated nucleic acid molecule comprising a second 
polynucleotide sequence that hybridizes to the first polynucleotide sequence of claim 
1 under high stringency conditions. 

8. The second isolated nucleic acid molecule of claim 6, wherein the 
second polynucleotide sequence is complementary to the first polynucleotide 
sequence. 

9. A vector comprising the nucleic acid molecule of claim 1 and a 
promoter that drives the expression of the nucleic acid molecule. 

10. The vector of claim 9, wherein the promoter is chosen from one or 
more of a promoter that is naturally contiguous to the nucleic acid molecule, a 
promoter that is not naturally contiguous to the nucleic acid molecule, an inducible 
promoter, a conditionally active promoter, a constitutive promoter, and a tissue 
specific promoter. 

11. A host cell transformed, transfected, transduced, or infected with the 
nucleic acid molecule of claim 1. 

12. The host cell of claim 11, wherein the cell is chosen from one or more 
of a prokaryotic cell, a eukaryotic cell, a human cell, a mammalian cell, an insect cell, 
a fish cell, a plant cell, and a fungal cell. 

13. A nucleic acid composition comprising a pharmaceutically acceptable 
carrier or a buffer and one or more compositions chosen from the nucleic acid 
molecule of claim 1, the nucleic acid molecule of claim 4, the vector of claim 9, and 
the host cell of claim 11. 

14. One or more polypeptide molecules comprising a polypeptide 
sequence chosen from at least one amino acid sequence encoded by SEQ ID NOS.: 1- 
104. 

15. An animal injected with the polypeptide molecule of claim 14. 

16. The polypeptide of claim 14, wherein the polypeptide has a function 
chosen from an agonist, an antagonist, a ligand, and a receptor. 

17. The polypeptide of claim 14, wherein the polypeptide is chosen from a 
polypeptide comprising a signal peptide, a mature polypeptide that lacks a signal 
peptide, a signal peptide, a biologically active fragment of a polypeptide, a 
polypeptide lacking a signal peptide cleavage site, a biologically active fragment 
consisting essentially of an N-terminal fragment containing a Pfam domain, and a C- 
terminal fragment containing a Pfam domain. 

18. A polypeptide composition comprising the polypeptide molecule of 
claim 14 and a pharmaceutically acceptable carrier or a buffer. 

19. A cell culture medium comprising the polypeptide of claim 14. 

20. The cell culture medium of claim 19, further comprising responder 
cells chosen from one or more T cells, B cells, NK cells, dendritic cells, macrophages, 
muscle cells, stem cells, epithelial skin cells, fat cells, blood cells, brain cells, bone 
marrow cells, endothelial cells, retinal cells, bone cells, kidney cells, pancreatic cells, 
liver cells, spleen cells, prostate cells, cervical cells, ovarian cells, breast cells, lung 
cells, liver cells, soft tissue cells, colorectal cells, cells of the gastrointestinal tract, and 
cancer cells. 

21. The cell culture medium of claim 20, wherein the responder cells 
proliferate in the medium. 

22. The cell culture medium of claim 20, wherein the responder cells are 
inhibited in the medium. 

23. A cell culture comprising transfected cells, wherein the transfected 
cells are transfected with the polynucleotide of claim 1. 

24. The cell culture of claim 23, further comprising responder cells chosen 
from one or more T cells, B cells, NK cells, dendritic cells, macrophages, muscle 
cells, stem cells, epithelial skin cells, fat cells, blood cells, brain cells, bone marrow 
cells, endothelial cells, retinal cells, bone cells, kidney cells, pancreatic cells, liver 
cells, spleen cells, prostate cells, cervical cells, ovarian cells, breast cells, lung cells, 
liver cells, soft tissue cells, colorectal cells, cells of the gastrointestinal tract, and 
cancer cells. 

25. The cell culture of claim 23, wherein the responder cells proliferate in 
the cell culture. 

26. The cell culture of claim 23, wherein the responder cells are inhibited 
in the cell culture. 

27. A method of making a transformed, transfected, transduced, or infected 
host cell comprising: 

(a) providing a composition comprising the vector of claim 9, and 

(b) allowing a host cell to come into contact with the vector to form a 
transformed, transfected, transduced, or infected host cell. 

28. A method of making a polypeptide comprising: 

(a) providing a nucleic acid molecule that comprises a 
polynucleotide sequence encoding the polypeptide of claim 14; 

(b) introducing the nucleic acid molecule into an expression 
system; and 

(c) allowing the polypeptide to be produced. 

29. A method of making a polypeptide comprising: 

(a) providing a composition comprising the host cell of claim 11; 

(b) culturing the host cell to produce the polypeptide; and 

(c) allowing the polypeptide to be produced. 

30. A diagnostic kit comprising a polynucleotide molecule, wherein the 
polynucleotide molecule comprises a sequence chosen from (a) at least 6, (b) at least 
7, (c) at least 8, and (d) at least 9 contiguous nucleotides chosen from the nucleic acid 
molecule of claim 1. 

31. A diagnostic kit comprising a polypeptide molecule, wherein the 
polypeptide molecule comprises an amino acid sequence or a biologically active 
fragment thereof, derived from the nucleic acid molecule of claim 1. 

32. A genetically modified mouse comprising a deletion, substitution, or 
modification of a sequence chosen from SEQ ID NOS.: 1-104, wherein the deletion, 
substitution or modification prevents or reduces expression of said sequence and 
results in a mouse deficient in or completely lacking one or more gene products of a 
sequence chosen from SEQ ID NOS.: 1-104. 

33. A method of determining the presence of the nucleic acid molecule of 
claim 1 or its complement comprising: 

(a) providing a complement to the nucleic acid molecule or providing a 
complement to the complement of the nucleic acid molecule; 

(b) allowing the molecules to interact; and 

(c) determining whether interaction has occurred. 

34. A method of determining the presence of an antibody to the 
polypeptide of claim 14 in a sample, comprising: 

(a) providing the polypeptide; 

(b) allowing the polypeptide to interact with any specific antibody in the 
sample; and 

(c) determining whether interaction has occurred. 

35. An antibody specifically recognizing, binding to, and/or modulating 
the biological activity of at least one polypeptide encoded by a nucleic acid molecule 
of claim 1, or a biologically active fragment thereof. 

36. An antibody composition comprising the antibody of claim 35 and a 
pharmaceutically acceptable carrier. 

37. The antibody of claim 35, wherein the antibody is chosen from one or 
more of a monoclonal antibody, a polyclonal antibody, a single chain antibody, an 
antibody comprising a backbone of a molecule with an Ig domain, a targeting 
antibody, a neutralizing antibody, a stabilizing antibody, an enhancing antibody, an 
antibody agonist, an antibody antagonist, an antibody that promotes endocytosis of a 
target antigen, a cytotoxic antibody, an antibody that mediates ADCC, a human 
antibody, a non-human primate antibody, a non-primate animal antibody, a rabbit 
antibody, a mouse antibody, a rat antibody, a sheep antibody, a goat antibody, a horse 
antibody, a porcine antibody, a cow antibody, a chicken antibody, a humanized 
antibody, a primatized antibody, and a chimeric antibody. 

38. The antibody of claim 37, wherein the antibody is produced in a 
manner chosen from in vivo and in vitro. 

39. The antibody of claim 37, wherein the antibody is produced in an 
organism chosen from a prokaryote and a eukaryote. 

40. The antibody of claim 39, wherein the organism is chosen from a 
bacterial cell, a fungal cell, a plant cell, an insect cell, and a mammalian cell. 

41. The antibody of claim 40, wherein the cell is chosen from a yeast cell, 
an Aspergillus cell, an SF9 cell, a High Five cell, a cereal plant cell, a tobacco cell, 
and a tomato cell. 

42. The cytotoxic antibody of claim 37, further comprising one or more 
cytotoxic component chosen from a radioisotope, a microbial toxin, a plant toxin, and 
a chemical compound. 

43. The cytotoxic antibody of claim 42, wherein the chemical compound is 
chosen from doxorubicin and cisplatin. 

44. The antibody of claim 35, wherein the antibody has a function chosen 
from specifically inhibiting the binding of the polypeptide to a ligand, specifically 
inhibiting the binding of the polypeptide to a substrate, specifically inhibiting the 
binding of the polypeptide as a ligand, and specifically inhibiting the binding of the 
polypeptide as a substrate. 

45. A bacteriophage, wherein the antibody of claim 35, or a fragment 
thereof, is displayed on the bacteriophage. 

46. A bacterial cell comprising the bacteriophage of claim 45. 

47. A non-human animal injected with the antibody composition of claim 
36. 

48. A host cell that secretes the antibody of claim 35. 

49. A method of making an antibody, comprising: 

(a) introducing a polypeptide, polynucleotide encoding the 
polypeptide, or a biologically active fragment thereof into an animal in sufficient 
amount to elicit generation of antibodies specific to the polypeptide, wherein the 
polypeptide: 

(i) is encoded by the nucleic acid molecule of claim 1; or 

(ii) comprises the polypeptide sequence of claim 14; and 




(b) recovering the antibodies therefrom. 

50. The method of claim 49, further comprising after step (a), the step of 
isolating a spleen from the animal injected with the polypeptide or polynucleotide or a 
fragment thereof, and the step of recovering the antibodies from the spleen cells. 

51. The method of claim 50, further comprising the step of making a 
hybridoma using cells from the spleen and selecting a hybridoma that secretes the 
antibodies. 

52. The method of claim 50, further comprising making a polynucleotide 
library from the spleen cells, selecting a cDNA clone that produces the antibodies, 
and expressing the cDNA clone in an expression system to produce antibodies or 
fragments thereof. 

53. A method of modulating biological activity comprising: 

(a) providing the antibody of claim 35; and 

(b) contacting the antibody with a first human or a non-human host 
cell thereby modulating the activity of a first human or non-human animal host cell, 
or a second host cell. 

54. The method of claim 53, wherein the modulation of biological activity 
is chosen from enhancing cell activity directly, enhancing cell activity indirectly, 
inhibiting cell activity directly, and inhibiting cell activity indirectly. 

55. The method of claim 53, wherein the step of contacting the antibody 
with a first human or non-human host cell results in recruitment of the second host 
cell. 

56. The method of claim 53, wherein the first host cell is a cancer cell. 

57. The method of claim 53, wherein the first or second host cell is chosen 
from a T cell, B cell, NK cell, dendritic cell, macrophage, muscle cell, stem cell, skin 
cell, fat cell, blood cell, brain cell, bone marrow cell, endothelial cell, retinal cell, 
bone cell, kidney cell, pancreatic cell, liver cell, spleen cell, prostate cell, cervical cell, 
ovarian cell, breast cell, lung cell, liver cell, soft tissue cell, colorectal cell, and 
gastrointestinal tract cell. 

58. A method of diagnosing a disease, disorder, syndrome, or condition 
chosen from cancer, proliferative, inflammatory, immune, metabolic, genetic, 
bacterial, and viral diseases, disorders, syndromes, or conditions in a patient, 
comprising: 

(a) providing the antibody of claim 35; 

(b) allowing the antibody to contact a patient sample; and 

(c) detecting specific binding between the antibody and an antigen 
in the sample to determine whether the subject has cancer, a proliferative, 
inflammatory, immune, metabolic, genetic, bacterial, or viral disease, disorder, 
syndrome, or condition. 

59. A method of diagnosing a disease, disorder, syndrome, or condition 
chosen from cancer, proliferative, inflammatory, immune, bacterial, and viral 
diseases, disorders, syndromes, or conditions in a patient, comprising: 

(a) providing a polypeptide that specifically binds the antibody of 
claim 35; 

(b) allowing the polypeptide to contact a patient sample; and 

(c) detecting specific binding between the polypeptide and any 
interacting molecule in the sample to determine whether the subject has cancer, a 
proliferative, inflammatory, immune, bacterial, or viral disease, disorder, syndrome, 
or condition. 

60. A method of identifying an agent that modulates the biological activity 
of a polypeptide comprising: 

(a) providing a polypeptide or an active fragment thereof, wherein 
the polypeptide comprises at least one amino acid sequence encoded by SEQ ID 
NOS.: 1-104; 

(b) allowing at least one agent to contact the polypeptide; and 

(c) selecting an agent that binds the polypeptide or affects the 
biological activity of the polypeptide. 

61. The method of claim 60, wherein the polypeptide is expressed on a cell 
surface. 

62. A modulator composition comprising a modulator and a 
pharmaceutically acceptable carrier, wherein the modulator is obtainable by the 
method of claim 60. 

63. The modulator composition of claim 62, wherein the modulator is an 
antibody. 

64. A method of treating a disease, disorder, syndrome, or condition in a 
subject, comprising administering the composition of any one of claims 13, 18, and 36 
to the subject. 

65. The method of claim 64, wherein the composition is administered in a 
manner chosen from orally, parenterally, by implantation, by inhalation, intranasally, 
intravenously, intra-arterially, intracardiacally, subcutaneously, intraperitoneally, 
transdermally, intraventricularly, intracranially, and intrathecally. 

66. The method of claim 64, wherein the disease, disorder, syndrome, or 
condition is chosen from cancer, a proliferative, inflammatory, immune, metabolic, 
genetic, bacterial, and viral disease, disorder, syndrome, or condition. 

67. The method of claim 64, wherein the disease is cancer. 

68. A method of treating a disease, disorder, syndrome, or condition 
chosen from cancer, proliferative, inflammatory, immune, metabolic, genetic, 
bacterial, and viral diseases, disorders, syndromes, or conditions in a subject, 
comprising: 

(a) providing an antibody composition that comprises a first 
antibody or fragment thereof that specifically binds to a first epitope of a first 
polypeptide or a biologically active fragment thereof, wherein the first 
polypeptide: 

(i) is encoded by the nucleic acid molecule of claim 1; or 

(ii) comprises the polypeptide of claim 14; and 

(b) administering the antibody composition to the subject. 

69. The method of claim 68, wherein the antibody composition further 
comprises a second antibody that binds specifically to or interferes with the activity of 
a second epitope of the first polypeptide or to a first epitope of a second polypeptide. 

70. The method of claim 69, wherein the second polypeptide comprises the 
polypeptide of claim 14. 

71. A kit comprising the antibody of claim 35 and instructions for its use. 

72. A method of gene therapy, comprising: 

(a) providing a polynucleotide comprising a nucleic acid molecule 
encoding the antibody of claim 35; and 

(b) administering the polynucleotide to a subject. 

73. A method for prophylactic or therapeutic treatment of a subject, 
comprising: 

(a) providing a vaccine; and 

(b) administering the vaccine to the subject; 

wherein the vaccine comprises a polynucleotide or a polypeptide 
chosen from at least one sequence according to SEQ ID NOS.: 1-104 or a 
biologically active fragment thereof. 

74. The method of claim 73, wherein the vaccine is a cancer vaccine, and 
the polypeptide is a cancer antigen. 

75. A method of inhibiting transcription or translation of a first 
polynucleotide encoding a first polypeptide, comprising: 

(a) providing a second polynucleotide that hybridizes to the first 
polynucleotide, wherein the first polynucleotide comprises a polynucleotide sequence 
chosen from: 

(i) at least one polynucleotide sequence according to SEQ ID 
NOS.: 1-104; 

(ii) a polynucleotide encoding a polypeptide comprising an amino 
acid sequence chosen from at least one amino acid sequence according to SEQ 
ID NOS.: 1-104; and 

(iii) a polynucleotide encoding a fragment of a polypeptide 
comprising an amino acid sequence chosen from at least one amino acid 
sequence according to SEQ ID NOS.: 1-104; and 

(b) allowing the first polynucleotide to contact the second 
polynucleotide. 

76. A method of treating a disease, disorder, syndrome or condition 
comprising administering a modulator to a subject, wherein the modulator binds to a 
cell surface molecule that is over-expressed in the disease, disorder, or condition, and 
is linked to the antibody of claim 35. 

77. The method of claim 76, wherein the antibody is capable of initiating 
ADCC. 

78. The method of claim 76, wherein the disease, disorder, syndrome or 
condition is cancer and the cell surface molecule is over-expressed in a cancer cell. 

SEQUENCE LISTING 
<110> FIVEPRIME THERAPEUTICS, INC. 

<120> NOVEL MOUSE POLYPEPTIDES ENCODED BY POLYNUCLEOTIDES AND 
METHODS OF THEIR USE 

<130> 08940.0012-00304 

<140> 
<141> 

<150> 60/485,217 
<151> 2003-07-08 

<150> 60/485,539 
<151> 2003-07-08 

<150> 60/476,621 
<151> 2003-06-09 

<150> 60/476,632 
<151> 2003-06-09 



<160> 104 

<170> PatentIn version 3.2 

<210> 1 

<211> 2145 

<212> DNA 

<213> Mus musculus 



aaacaggtga cacaggagta gatgttgtct tagtcagggt ttctattcct gcacaaacat 60 
catgaccaag aagcagttgg ggcggtaagg gtttattcag cttacatttc cacatttctg 120 
tttatcacca aaggaagtca ggactggaac tcaagcatgt caggaagcag gagctgatgc 180 
agaggccatg gagggatgtt ccttactggc ttgcctcccc tggcttgctc agcctgctct 240 
cttatagaac ccaagactac cagcccagag atggtcccac ccacaagggg cctttccccc 300 
ttgatcacta attgagaaaa tacctcacag ctggatctcg tggaggcatt tccacaactg 360 
aagctccttt ctctatggta actccagctt gtgtcaagtt gacacaaaac tagtcagtac 420 
agatgtcttc atggagaaga gagggtgagg attgtaacta tctggggaga gaagccgggt 480 
gggtgggata agatgtgcga tgatcttttg tgtaacattc tcacatactc ctcaatgact 540 
tatacatgtg attccaagcc cagtccacag aatggactaa gtcatttcct ttcctgggcc 600 
agggttctga ccgtgtatgg agagtagagt agatttaaga ttgcccatct tccctggttc 660 
tctgtggtca ggcttgctca ggactcagtg cacatgcctg tcttgggagg tgcgttagct 720 
cttccctgat gccatttctc aagctgtaga gttttcccct gtccccttga acgagccact 780 
gtcgtctggg taaggggact cccctgcctg cagcaaggca ggactattgt gttccctcct 840 
tagttctgtc ttccatctgt taacagtggt gcctgctgct ctttatgttc atagtctgac 900 
gagggaggct aatgacccac agtggggtcc cagggtcaaa ctgcctatgt ccttcttctg 960 
ccgttgacca tgagacttgg gctcatggcc tctctctgct tccttgggtc agtgtgggta 1020 
ggcacaggga aaggctctgg tgaggacaag gctttgctga ccattctgct gcttctctgg 1080 
gtctctctgc cttgtctgct ttctcctcct tcttctgttt cccctgtttc ctcctcctct 1140 
ggtaccttct tcccccaacc tggagagtgt gcctgggttt acagcgtgca tcccctgccc 1200 
tcactcagga acaggtgtgt ggcctctgcg ggaactttga tggcatccag aacaatgact 1260 
tcaccactag cagcctccag gtggaggaag accccgtcaa ctttgggaac tcctggaaag 1320 
tgagctcaca gtgtgctgac acgagaaagc tgtcactaga tgtttcccct gccacttgcc 1380 
acaacaacat catgaaacag acgatggtgg actcagcctg cagaatcctt accagtgacg 1440 
tcttccaggg ctgcaacagg ctggtgagac tttgtgggta gagaggaggc aaaatgatcc 1500 
aaaggcttgt gcctgccttg agtgtcagtg tccgatgtga gcaccaacta ggctggagtc 1560 
ctgcaagggt tcaccacaat agctctttct tggcttctag atgagaagag catgcatagt 1620 
ctttgaggga agcaatggct ggccaagcgc tttccttgta cgcaagaagt ctcttgctct 1680 
gaagttctct ctgcattgcc tccaatcaga gccagggaat tcctttcctt gttactgttc 1740 
tacgggtcca gacccagcca gaacctttcc aatcatcaac cacaagcaaa fccagcaaact 1800 
gcctctttag tgcccagttc ccttctcatt caaactcgtt ttctggcatt aagtccagtg 1860 
attttggagt ctgctttttt tttttttctg gcttcttttc attcctctgt ctcacacatt 1920 
tcatagcata taacatcaaa caatagtgat gaattctcca cagttaagtg gacttctttg 1980 
ggcatttgca tgcacgtgcg cacaagcatg tatgattgtg tctctacaca aatgcttatc 2040 
cattgtgcac ctgtgctgtt tcccttaaac atgattgatg ggttgagtcc atgtatctca 2100 
tttttttttt agctctccag agacttccag cccagaaaca gtcct 2145 



<210> 2 

<211> 2412 

<212> DNA 

<213> Mus musculus 



gag2tcatag atctttgtgt aaaactgagt atctccatgg gcatgtaatg tttaagtctg 60 
cctagttaac agtgacaaac tttatttttg cattttgggc aatcattgac ctctgacaga 120 
acttaaggga catatttatg agcttctggg aaaggtcttc aattctactc cccatcatga 180 
cctctacagg aacgttcctt ctagccagcc aagtagcaaa agaacaggac gacatctgag 240 
ctgtcccttc cctcctccgt gggccatacc gctccggggg cttgacgcca ggggtaacct 300 
gttctcatct gatgttctct ttagagaatg gcaatggtct ctgcaatgtc ctgggccctg 360 
tacttgtgga taagtgcttg tgcgatgctg ctctgccatg ggtcactcca acacaccttc 420 
cagcagcatc acctgcaccg gccagaagga gggacctgtg aagtgatcgc ggcccacagg 480 
tgttgtaaca agaaccgcat cgaggagcgg tcacaaacag tgaagtgttc ctgtttacct 540 
gggaaagtgg ctgggacaac aagaaaccga ccttcctgtg tggatgcctc catagtaatt 600 
gggaaatggt ggtgtgagat ggagccctgc ctagaaggag aagaatgtaa gacactccct 660 
gacaattctg gatggatgtg tgctacaggc aacaagatta agactacacg aattcaccca 720 
agaacctaac agaagcattt gttatataaa taggaaaaag aacaacctgt ggaatatacg 780 
ttgtgaggat ttaaaacatc ttccatagtt gcaagccaag tggatctctt atctgcactt 840 
tggttaccag ataaccacag tgcacttact ctgatacaca gtatcccaaa agaagaagac 900 
tcgggatttt ctggcaacat caaggaaaat ggcttttaaa aaaaaatgag ttttctctgt 960 
gaaatttgga ggatcatgaa gaacgatcaa ctgtcttcta atttggaact aacattactt 1020 
tgtaccattt gaaatatata tgtatatata atattttgaa atattatata ttctcttcaa 1080 
gaaatgaaca gtaccacaat gtgaggtggc tggtgtatcc ctttcagttt tggatgtttg 1140 
gtcggttttg ttttgtttgc cattcctttt tctctcggta aggaagatac atgcccatgt 1200 
gaaaatccaa catggcactc tccctggaag gccagctgca agccgactcc tggaagctga 1260 
ggcatcctaa cagtactgag tcaagagctt ccccctgttt ctacctggtg acccaaggaa 1320 
gctccttgtc ttgatttatt gctttctatc ctgtgcaata ttagcatgca agcttggctt 1380 
acataatcat actttatatt cgattgatat ataataaccg ttctaacctc ttccaggaaa 1440 

gatactccag ttctgaaggt ataaaagaaa gtgacatagg aatgtgcatt tttaatggct 2100 

gtaataactg actgtgtggc ccatgaaatg gaacttcaag agaagtggac tttttcttca 2160 

tgttcttccc aatttccaaa tgaagattca aagcctggat tcaaaggtct cttgactcta 2220 

ccggcctctc aatgctacat tcaatgtctt gattgataag aacaatgtgc atatgaccag 2280 

gaatgatagt aaggtaaaaa tgtgtttgta tttgcccatt cccaagttac tttatgatac 2340 

ctaaaggagg ccatgattga cagatcatca ttatcccttg aataaatgtt ctagcataga 2400 

gtacattgca tgtggcttta agatttgctt agacaaaagc agataccata catctataat 2460 

cctgtgtctt gttttattct aataatagtc tccaagtgct tctcaattca actcacctat 2520 

gttactcttc ccctgattgc agtgttcctg atggctgtta tctttcccga tttagtcact 2580 

tcctcttaat agacactaag agtatctttc agaagctcct gagagac 2627 

<210> 4 

<211> 3153 

<212> DNA 

<213> Mus musculus 

<220> 

<221> modified_base 

<222> (1410)..(1410) 

<223> a, c, t, g, unknown or other 

<400> 4 

cccccatcct tgcctaaact ctagatatgg tctctacatg ttctatctcc cctttgttgg 60 

gtatttcagc ctatctcatc cccctttggt cctgggagcc tcttgctttc ctggcatctg 120 

ggaattgctg gtggctatat ctagttccca atcccccatt gctactaaat accactgttc 180 

aatttcctgg cactctgtat atcacccctg cctcctccca tacctgatcc catccccaat 240 

tctcccttcc cttcctcttt tcctccgaag tctctcccac actacttcac aagagtattt 300 

tgttccccct tctacacatt tcagtgttcg ttcttcttga cctacatatg gtctatgaat 360 

tgtaatttgg gtattccaag cttttgaaca aatatcaacc tatcaatcag tgcataccat 420 

gtgtgctgtt ttgagactgg gtcacttcac tcaggatatt ttctagttcc atccatttgc 480 

ctaagaattt tatgaagtca ttattttaat tagctgagaa gtattacatt gtggaaaggt 540 

actacatttt ctatatccat tcctctgttg aaggacacct gggttctttc cagcatctgg 600 

atattataaa taaggctgct atgaacatag tagaacattt gttcttgtta tatgtctttt 660 

gggtatatgc cctgtagtag tatagttgta tcttcaggtc caattttctg aattcaatta 720 

tctgagtaac tgccagactg atttccagag tggttgtact aggttgcaat cctaccaaca 780 

gtagaggaag gttcttcctt ctccacatcc tcaccagcat cttctatgtc cagaattttc 840 

gatcttagcc attatgacta ttgtaaggta gaatctcagg gttgttttga tttgcatttt 900 

cccaatgatt aaggatgatg aacatttctt taggtgcttc tcagccactt gagactcctc 960 

atttgagatt tttgtttgtt tgttttagtt ctgtactcca tttataatat ggttatttgg 1020 

ttctttggaa tctaattggt tgagttgttt ggatattagc ctgctattgg atataggttt 1080 

ggtaaagatc ttttaccaat ctgtaggttg ccattttatc ctgttgaccg tgttctttgc 1140 

ctaaataaaa taaaataaat aaataaaaat aaacttttca attttatgag atcccatttg 1200 

tctatagttg aacttacagc ctgagccatg ggtgtactat tcaggaaatt ttcacatgtg 1260 

ccattgtgtt caaggctctt tcccactttc ttttctattt gattcagtgt gtctggtttt 1320 

atatggaggt ccttgatcca ctgggactag aggttagtca atgagatagg aattgatcag 1380 

tttgcatatg tctacatggc atgaccatan ggtgtgtggg tttaattccg ggttttcaat 1440 

ttcattccat tcatctacct gtatgtcttt gaaccaatat catgaggttt ttttgttttt 1500 

tgtttttttg tttttatcac catcacactg tagtgcagtt tgtggtcaag agtgtttatt 1560 

ctcccaggag ttctcttatt ggtgggaata agaatagttt tcatatcctg cgtttctgct 1620 

atacagataa tttagaatgc tcttctatat ctgtgagaat gagttgcatt tgatggggat 1680 

tgattgatct ggtattgctt ttggtagatg ccattgtttt atgtaatgct gcaatcatga 1740 

gaatggagat ctttctgtct tctgagagct ttttcaattt ctttctttag agacttgaac 1800 

ttctttactt acagatcttt cacttgcttg gttagagtca taccaagata tttcatatta 1860 

tttgtgacta ttttgaagga tttcatttcc gtaatttctt cctcagacta tttatccttc 1920 

gagtagagga aggctactga ttgaaaaatt ttatatccag ccactttgct gaagttgttt 1980 

atctgctata ggaattctct gatgcaattt ttggagtcac ttaagtacaa tatcatataa 2040 

cctgcacaga attatatcat gacttcttcc tttctgatat gtgtcatttt gaactcattt 2100 

ggtggctaat tgccctgggt agaacttcca gtacaatact gcatgtgtag ggagagagta 2160 

ggcagccttg tttaatccta gattttagtg tcattgcttc aagtttcact ccattttgtt 2220 

tgatgctggc tattggtttg ttgtgtatgg taagtatgtt taggtatggg cctttaatta 2280 
ctgatatttc 


caagactfctt: 




gttgttgtat 


tttgtcaaat 


gatttttcag 


2340 


catctaaaga 


tfcfcgaccatg 




tctttgagtt 


cctttatata 


gtagattata 


2400 


ttgatacatt 


tccatat-af-h 


/~T ^ 4— 4 — s~% 

yddtCdLCCC 


tacatccctg agataaagtc tacttgatca 


2460 


tQQtcjaatcra 


tccrttfctaafc 

*»%*5J V- W V->53 CL L- 


y tyt tec t-gg 


atttggtttg 


tgagaatttt 


attgaggttt 


2520 


ttattgcatg 


aatafctrafa 


dyCaaaaU Uy 


gtctgaagtt 


ttcctgcttt 


gttcagttgt 


2580 


tatcrfccFtfctt 


f~ rrni" nfra rrr* 
*— yy LaLLa^L< 


ataac cgcac 


cttcataaaa 


cgaatggggt 


ggtgtccctt 


2640 




t.wuy cidciu 


aatttgaata 


gtattggtat 


taggtcttct 


ttgaaggtct 


2700 


53 «■ w vj.^— ^ CI CL \^ ^ 


a <"» 2a 4~ o ^ =a 


taatctgetc 


ggatttttgt tgttgtttgg agatgtttaa 


2760 




tdtLtCLLta 


gggtfctatga 


gactatttag atgatttatc 


ttactctgat 


2820 




tayctgycau 


ctguatagaa 


attgtccatt 


ttatccatat 


tttcaagttt 


2880 


tcttgagtat agtcttttgt agtaggatct catgattttt taaattttct ctgtgcccta 2940 

tagttagttt gactaaaggt ttatctattt tgttcatttt ctcaaaaaca acaacaacaa 3000 

caacaacaaa aaacagctcc tagttgtgtt aattctttat ataattctct gtaggcttat 3060 

tgattgtgtt ctgctccttg gggaagaacg atacatgagt tgaccatgct tacaaaagat 3120 

tgtctctggc tgggtgtggg aaatgcctta agg 3153 


<210> 5 

<211> 2900 

<212> DNA 

<213> Mus musculus 












<400> 5 
gagtatcaaa ggcatgaacc accauacacc gcctccaaac ctgcttgaac atgaagcctc 60 

agtccgtcgt ccaagaagtg ciu u LCycaua cacaacaaac acgtcaacgc ccccacctca 120 

gcacacagag gaaaatgcca aacaggctac ttacgaggag tcttcgtttg ccttcgaaag 180 

accccaacag gtcggccact gacttcttgg gacttggagc gagatgtttt gcggtgggct 240 

tctgattgtc atcctgctct gtcttacccg cctttttctt cttattcttg tctttctcct 300 

gtttggtctt tttctcagct ttcttcgcct tgcttttcac tcttgagtaa cttgtcttgt 360 

tttgcactct tccctttttt cttcttctcc ttctcaggtt tctcgagctt ttcaagcttt 420 

ttcgactcct tggcattctg cgagggaacg ctatccacct gccgctcctc ctccacctga 480 

gaggaggtgg gctggctgag gtcatacttg tcttcatatt cgttgctgag aattctgtcc 540 

ttctttttgg gcagcttccc ctttactggc ttgtgaggac ctggcaccgc atttgggtcc 600 

tgatggccat gttcccgttt gtctgtccgg ttgtcccgga aacggcttgt gcccacaact 660 

cttgcgctga cctcccagag ggtgggaggt ggggtggctg tgaagcttcc atagttggtg 720 

gctttgctgg gcctcctggt ggcctgtggc ttctctctct gttgctcctt ccgaggtaga 780 

gggtatagct gctctgagac ggaggggccc ctggcagtgg tcacctctgc tgttgcaggg 840 

ggcctgtggg agactgagaa gggatgcaac cgggatgtcc agggcctctg ggtagctgga 900 

taggcagtgg tggttgtagg tcttgcggct attgtcacta cccgggatgt ggcccgagtg 960 

gctgtcgtga ctggagcagg aggaagggtg gtggcccttg gagttggggg aggctgagga 1020 

gtggctgcct tgctggtggc ctgctttctc ggagcctctc tggtggggtg gacttgtgtt 1080 

ctccttgggt cctctttcct cttatcaccg cccaggcctg tacctcctgc tcctccaccg 1140 

ccctcgtttc cttcctggac tacatggccc tctatgcccg aggccttaca cttttggaca 1200 

aaacccttct gcctgatttt ctcgatcctg cggatgggac cttgatcaat gacctcatac 1260 

atggcttcca gcctgaccgg gtaagggtag cgctcctcca cctgcagagt cttttttaac 1320 

agcaccatgc taaacttgcc cttctccagt ttcaagaagc tcatcagctt tgggatgaga 1380 

ttggggtcca gaggctgctc caggatctgc ccttcattgg tgatcctccg taccttgccc 1440 

ccttcctcgc ctgcctggtg gaagagcaca atctgttgga tgtgcctttc tgccagctca 1500 

caatacacgt cgtccttcag gaggctcatc atgaggcggt agtagccctc tgaggcgtga 1560 

ggggctgaga tgacccatac cctgttcttt cctgcaaagc tggccaggat attgggagag 1620 

ctagatccag aggggaatcg caacattctt gtccgagcag aagacccctc atctcggacc 1680 

atctcacgcg ccgagctcct agctatgggt ctcacctcag gtctgactgg aactccattg 1740 

atacctgagg ggggtggtct catggtcggg tgagctaatc tcaacactgg tacactcfctc 1800 

ctcctttgga aaggctgagg atttggttct tcctgggtgg atttctcaac tccaccagac 1860 

ctcccggtgt gcctcagata ccgagctgac ctactgctga ttggagaaac caaaggtact 1920 

ttccgtccag tgtggctgtc gctatccaag gcaggagaat gagatgctga tccacacacc 1980 

agccacatgg ccaagagcgt ggtgaagtgg ggtcccattt tccacatcat tgtattatcc 2040 

acttggggac gcagaggggg tataatacaa aaatgaaaat aacataaaat gagaagggag 2100 

aaaaagaaaa gaaaaaaagt cattaattgc aagcagagct gggcatgatc tctttctcta 2160 

acttggccaa aaagaaaggg caggagatag ttaaagaaga aaaaggaggg caagcagagg 2220 

agggcaaaca gagagagggc ttctcttaaa ctgctgcgat tgtcagtgct tagcagagtc 2280 

agtctttaag caggaaatcc agctttgatt tacaggcggc aggcatgagg cacgcctgct 2340 

ctgcctcagc tgtgcccaga tgcaccagag actgtgtaaa ctgagcatgc taattatgaa 2400 

ttctaactgt gaatgtgcat tcacacttca tgtcttcaaa caagcaacaa ggaaacaaac 2460 

accagagagc aaggttggca cacttacaca gtggttcatg gagatgcttg cagaggcttt 2520 

ccagagaaac gtgaaggttg gtccccagga cagtttccta ggtgagatcg gttccttcct 2580 

gctctctgtc tccttgtctg tctgcaaagc agccccaggg agcccagcct gggccgattc 2640 

cttttgctgg ctgtgtcttt taccttctgg tccagtggga ggaggtgaac gctgctgtcc 2700 

ttgggaccgt ttgtatctcc atttgtctgc agtttctcct gggtctgtgt gtgccggtcc 2760 

ttggttgcca cagggatctt cctgaaagtg aagtcctagt gatttctgaa tctcgctctt 2820 

cccttctgtg gagttttccc agaacttctt atcacccttc ccctcctgca gtctggccga 2880 

gtccctttgt ttccgagcgt 2900 




<210> 6 

<211> 1852 

<212> DNA 

<213> Mus musculus 

<400> 6 



atcttctttg atttctttct taagagactt gaagttctta tcatacaaat ctttcacttc 60 

cttagttaga gtcacaccaa ggtattttat attatttatg acctttttta tttttttgtt 120 

tttcgtgaca gggtttctct gtatagctct ggctgtcctg gaactcactt tttagaccag 180 

gctggcctca cactcagaaa tccgcctcct ctgcctctcg agtgctgaga ttaaaggtgt 240 

gcgctatcac acccggctct ctttatgact attttgaaga gtgttgtttc cctaattttt 300 

ttctcagcct gtttatcctt tgtgtagaga aaggccactg atttgtttga gttaatttta 360 

tatccaacta cttcactgaa gttgtttagg agttctctga tagaattttg gggtgactta 420 

aatatactat catatcatct gcaaatagtg atattttgac ttcttccttt ccgatttgta 480 

ttcctttgat ctccttttgt tgtctaattg ctctagctag gacttcaagt actatattga 540 

ataggtaggg aaagagtggg cagccttgtt tagtccctga ttttagaggg gttgcttcaa 600 

gtttctctcc atttagtctg atgttggcta ctggtttgct gtatatggct tttattatga 660 

ttaggtatgg gccttgaatt cctgatcttt cccagacttt tatcatcaac ggaggttgga 720 

atttgtcaaa ttctttctca gcgtcccaag agatgatcat gtggtttttg tctttgagtt 780 

tgtttatcta ggtggattat gttgatagat ttctgtaaat tgaaccatcc ccgcatccct 840 

gggatgaaac ccacttgatc atgatagatg atcgttttga cgtgttcttg gatttggttt 900 

tcgagaattt tattgagtat ttttgtattg atattcataa gggaaattgg tctgaagttc 960 

tctttctttg ttgggtgtgg tttagatatc agagtagttg tgacttcata gaacaaactg 1020 

ggtagagtac cttctgtttc tattttatgg tatagtttga ggagtattag aattaggtcc 1080 

tttttgaagg tctgatagag ctctgcacta aacccatcag gttctgggct tttttttggt 1140 

tgggagacta ttaatgactg tttctatttc tttaggggat atgggactgt ttagatcatt 1200 

aatctgatcc tgattttact ttggcacctg gtatctgtct agaaacttgt ccatttcatc 1260 

caggttttcc agttttgttg agtataggct ttcgtagtag gatctgatga ttttttggat 1320 

ttcctcagat tctgttgtta tgtctccttt ttcatttctg attttgttaa ttagtatact 1380 

gtccctgtgc cctctagtta gtctggctaa gggtttatct atcttgttga tttttctcaa 1440 

agaatcagct tctggtatgg ttgattcttt gaatagttct ttttgtttct atttggttga 1500 

tttcagccct gagtttgatt atttggtgcc ttctactcct cttgggtgag tttgcttcct 1560 

tttgttctag agcttttagg tttgctgtca agctcctcgt gtatgctctc tccagattct 1620 

ttttgaaggc actcagagct atgattctaa cttttaggga ctgcttcatt gtgtttcata 1680 

agtttggata tgttgtggcc ccttctcatt aaactctaaa aagtctttaa cttctctctt 1740 

tataaccagt cttccaaaca aaggaaagat aaggaatatc tgactgtgct cttctttcca 1800 

ggggggcatt cacactaata tgtaggctta agtgatttgc ttctttaaaa gg 1852 




<210> 7 

<211> 2417 

<212> DNA 

<213> Mus musculus 

<400> 7 

ggtgaaggtt gccctaggat gtctaagaac atggctaaac aagccatggg gaggaagcca 60 

ctgagcaacg gtccttcaga gccattgctc tgttcctgcc tccaggttcc tgctctggct 120 

tcttccttca gtgatgcccc gtgaactgta agatgaaacg aaccctttcc ttccctgatt 180 

gcggttgatc atcgtgttct ttgcagcaac agaaggcaaa tgaggacacg gaccattagg 240 

tctaactagc cccttctacg tgtatttgga ggcccgaccc ctagacaatt acagtaccca 300 

tcacattggt gacactggcc accatggcta ggcatgacag ggaagcttca ggactcccct 360 

ttgccagggt gttctgagtc tggaacaagc ttctccttaa ctcaaaggga gttcacgctg 420 

ctgagttagc ctggagacat ttggggtctg tggacagtga agatggatgg tgtctacagg 480 

atggtggtgg tggtggtgac agcaactgac atgagctggg tgtgggactc aggcattact 540 

tggatcagct cattcagatc accacagctc tagcaagaca gtggcaacat tgccctcttc 600 

tttagatggg agactgtgtt gctaaaactc tcatcagact gttgatggta gcatccggct 660 

ttgctcacct ctgcaagtga ccatgaccag gggatgagat ctctctactg acaagcaagc 720 

aagtcacggg cacttgtggc atacaatatg gcccctgttc ttctgttact cttcttatta 780 

gataaaaagt gtgaatgttt gtgtgtacat gttcatatgc atgtgtgtga gcatgtatgt 840 

atgtgtgagc atgtatgtgt gtgtgagcat gcatgtatgt gtgagcatgt atgtgtgtgt 900 

gagtgtgtgt tactcttctt attaaaataa gaatgtaaat aaaatgtgtg tacatgttca 960 

tatgtatgtg ttgtgagcat gagcatgtgt gtgtgtgtgt gtgtgtgtgt gtgagagaga 1020 

gagagagaga gagagagaga gagagagaga gagagcattt gtacaggtga aatgtcaagt 1080 

gtcttccttg atctcctccc atcttgtttt ttgagatgaa atctctcact gacctggagc 1140 

tcctagactg tgtagactga caaaccaacg atgctcacgc cattgcctcc ttagcaggga 1200 

atcatggatg tgcaagacca cacatgtcct ttttcatgga ttctggggac tcagtcccag 1260 

gttctatatt tgtgcagcaa acattttatc agctatgcca tctccccgga cttcgttgtc 1320 

atttattgta atgagtgatg tgagctaatg gtgtgccctg attcagatct cccgggttaa 1380 

cgtgtaacgt ggaccactgt tcaaggcata tggggcagaa gttctgatcc attttaggtt 1440 

ttccagagtg gcaggagagt taggtaaaag ggtcagcaat gacaaatgta gagtgccaag 1500 

tgcttcatct acatgcagac aaagatatct gttcccccaa gacacctggg ttgtcactga 1560 

gaaaatatcc acaaggcaaa acaatgccac gtgggcaaac ttacctacct tcagccttgc 1620 

tattggagcc attcactggc tgctcaccag aaacacagga acggagttct taccctgtgg 1680 

ggtactggga ccagggataa gattcctttt atttgtctgg acatgaattt aagctacaga 1740 
acctcacaac ccattacaca ttgaagaact cactatatcc agccaccagc catctatcag 1800 

gtgaccactg agaggttcca ggacccaaga aaaccacgag tgtgtgagcc tcctccctcc 1860 

acagtgtcgg agcctccccc tatgtcctat gctcccgctg gatacatcag cacatgcccc 1920 

accctggggc ctggcccacc ctttggccac cgtgagggca caaatgagac ttggagtcat 1980 

tccagcacct ctgctctgga gcggccgtgg aaggaggagg tttgttgttg atgaatcaat 2040 

agtagtgagt gactctcatt ccacggcagg aattccaggg atacaaacac acatctcaga 2100 

ttaaagaaca tgtgctagag catgaagtag cttatcacaa ccatttaccg tctccagcgc 2160 

agacagaatg acaggagaga ctaggcagag gtcacagggg gtacatttcg ccatcaaaag 2220 

ccgtatcgac tggagtccat tatgacaacc tccgagtgaa aaggtggaca ggctccggag 2280 

cattagtgag cttcccaaat acaagaaaat tgatccgtga ccaattaatc ctggcagctt 2340 

tcattttcga tttcctttct gattatacgg aaccattaaa atgaacccaa atagaaagga 2400 

atgagatggg acatcgg 2417 



<210> 8 

<211> 1298 

<212> DNA 

<213> Mus musculus 

<400> 8 

gactgtggcc aggtgcagcc aagccactct tgccggcgct cgtatcctta gttaggtggc 60 

gaaatgctga tgggctctcc tattcggggg aatgcggtag tgatgtataa ggtatgaaag 120 

atgtggcctg cagagagccg gcactatctc tactgctgct gctgctgcca ccgctgcctt 180 

tgctgccgcc actactgccg ccactagagc tgccgggagt gagcctccct tcttccagat 240 

acccagctcc gttgcatctc ccgctccctg gccccgcacg gacctccccg ccccagggaa 300 

caaaagcaga gtccgggtcg cccctgttct gcagccagag gatgatgtgt cgagagggct 360 

gagcagagtg tggtcttcag cggctggaag ctgctccctg ccctttctcc atgagtcctg 420 

actcgtggcc ctggcaatct cagaagagct tcactgaatt cccttccgac cagaacctgg 480 

gaactcactc cccccagcat cacccctgtg cagcctggac acactgactg aaccaacttt 540 

taggacattt cattcagtaa gacatggtgg tgaaccaagg ctctaaatct tcatgattga 600 

aggactctta cacaaggcaa agatcaagat gctctttcac tcagtgttgg atctccactg 660 

tttctgcaca tctcatctga catcatctca cctcttagaa tgagatggca tttgctccct 720 

aaagagtttg atggctcaag catggtggat taggcctttt ctgctgggat gctatgactc 780 

catttacatg tatgtgtctc tatcactatc catggccctc tgaaatgggg aaatacatct 840 

tttctattgt tttcctgatt gttaacacag tacccaacac atagtgaagt gtaggatatc 900 

tgataaacaa atctatcaat agatttggca aacggggtag ccctgtaata ttctatggaa 960 

agagtaaaat atatgcaaat gtggacagat ggtgtctcct ctctcaattt aaaatgacaa 1020 

agtgtattat tagtaaagca caaagaggaa aaagactgta cgtgggagtg ctgtaaagaa 1080 

actggctgtt gaagagatgc aaagacttca atagatttat tcatgcgctt acagataggc 1140 

agctttttag tccatggaca tttcatcatc ctgtacaatt gctattgaca tgccaatgag 1200 

ttccacattt ttctgtccat ccttactgaa ctcagggctc ctcttgcatt ctcaaacatc 1260 

ttctggatat attctggcag attaagtttc aaaaatcc 1298 

<210> 9 

<211> 4319 

<212> DNA 

<213> Mus musculus 

<400> 9 

acacagtttc gaagaagccg gcggtgcccg caaaggctgg agtagcgaga acgactccca 60 

gtctagcggc tgcgcgctcc tcaacctgca gaccgaaagc tcgggaggac ccggcctagc 120 

cccgcagaac tggtgggtgc tggaccagag catctaggtg ctctaagccg gagactccag 180 

cttagccaga gcccctgctc tcctcgggcg caacaacttt gacgatctat cgcggacaga 240 

gctcaaaacc agacaacacc taagctggcc actccctcga agagcccgat ttggagagtc 300 

tagatgccaa gctcagaagc agcgaggacc tcaggccaga aagttgaacc cgggccactg 360 

ctgtgaagca gcttcctctt agccactcca ggggtctggt ctctctccat tctgaacaac 420 

cgccatattg ttcatcagac ggaacaagtc tgagaccagc agatgaacag gcagaggcct 480 

gaagacccca ggaacagctc tgccttcctt gttccttgat ccctctctgt caatcatcat 540 

atcctacagt gacatcccgc cccaccgaaa gctgcctgga gacaagaccg atggagaggc 600 

ccaccagcaa ctggagcgca ggcagctggg tgcttgcact gtgcctcgcc tggctgtgga 660 

cgtgtccggc ctctgcttcc ttgcagcctc caacatccgc agtcctagtg aagcagggca 720 

cctgcgaggt gattgctgca catcgctgct gcaaccggaa ccgcattgag gagcgctccc 780 
agacggtcaa atgctcctgc ctgtccggcc aggtggctgg caccaccaga gcaaagccct 840 

cctgcgtgga cgcctccatc gtcctgcaga agtggtggtg tcagatggag ccctgcctgc 900 

tgggagagga gtgtaaggtg ctcccggacc tgtcagggtg gagctgcagc agtggacaca 960 

aggtcaaaac caccaaggtc acacggtaac tctcggaggt catggcttag gtaggacagc 1020 

cttgactgag ctcgggactg aagaaaggcc tggtcaccag acagcagata agaggactta 1080 

cctggacatg tgcccatgtc aagtgtaaca tagaccggcc agggcccctg ggcagcactc 1140 

tgttcagcta aaatgcttgg atctttggcc acacacttga gagacctgtg ctctcctatg 1200 

aacaaagtca acacacaaac ctatccttaa ggaatatctt gtcaagttaa ggagtggatg 1260 

acaggcaatt tcaacagttt aaagtgtttg gtaccgtggc ttctgagcga cctgcagtgg 1320 

gtgtggtggg gtggggcgga tggaatttac acagatcttc agccacccga tcccaggaca 1380 

caaagttatc agggccacac cagtatcttc ccatcttggc cttcccatca aacctgcatc 1440 

cccagcaaga tgctaagatg tgagaggaaa acactgcccc ctggtgccaa gccagtggga 1500 

tctcttttgg acagcactgg aaagtagaga tgaagggctt cttccgcaca cctctgcaga 1560 

ggcaggggcg gggcgcacat cccatgctgt ccatggatgg acaggacctc agtcagaatc 1620 

acccctccac cttgaacttc gggtgtttgg ttgctgtttt aaaggacgag gaaaccgagt 1680 

cacagaggat gagatggatt tgcccaggat ctcacagctt ggccttccaa aaacacggat 1740 

gttaggactc cagcctaggc cttccgtggc agttttatct agacagtgag accccgcaaa 1800 

cactgtctgg atagcaccaa catgattctt gggagatggg actttgacca ctggaatttt 1860 

ggttgaggcc actgatctcc catcaaacaa acaaacaaac aaaccctaga atatacacac 1920 

acacacacac acacagcaga gtggagctcc ttctcagtcc cccaaggcca aacaggcccc 1980 

gtgacagccc agaggtcgct cagcccagcg gcagcagaag cagtgtgagc agttaggtat 2040 

cccatcaccc actctgcttc tctatcagga gcttcagcca gccccctaca taggtacctg 2100 

ccaggaggca gaggggcagg ccagaggact tcataaccag gctggcctcc tagatggctt 2160 

gggagtagca agctggctta cttttattac caaggcctta agtgttagag tgagtgttag 2220 

agaccagcct ggatgatggt tccctgagaa ggtaaatcag ccatagttta cactggaatc 2280 

ttggtgccct tctccttgcc tgtgtcccaa tgcatttgac taagactctg gtatcacccc 2340 

acctcagggc atgttttggg tttggcaaaa tgagatgaac agtaacgtta tctttcaaag 2400 

gcaggcctct gtcagtctct gcttgtagca cggtgacagg ccagggtcac caagggtcaa 2460 

tggcagtcat ggtctccttc agtactgagg agtggaggcc agagggtgtg ttaagtctcc 2520 

gtgacatatt ctttgttttg ttgttttgct tctttttgtc tttaaaaaca ttatagactt 2580 

gccaatgcca tctttagctc tagggaggga aggaacacat tgttgcccat gtcaggtctg 2640 

tgacccttcc caggggacgt cgaaggcctg ggatggcctc ccagcctctt ggctcaccct 2700 

ctgtaccatg gaaacctgag cggcagggct ggggctgggt ttgttttcct cctgatgctc 2760 

agagagagat gtgtgagtga attatttagg ttacggcttc agcagagcca gacagtgaaa 2820 

gcctgctttc ctggaggcaa aaggtagttc tgtcgggggg tgtggagcgc catagactct 2880 

ctgctcactc tgtcctggac actgtcaccc aacggtcggc tcccctgcct tgccccacct 2940 

gtgttgtcca cacgaactgg gttagagtgg agagttcaga gatgacgcgt cacaagcctc 3000 

cgagagcggc cagtctcctc cagctccccc ccaccccccg ctgcctcggg gtcctgattc 3060 

tgtaccttct ctgaccccca ctaaagttgg ctaattcctt gttaagtctc cctccctgcc 3120 

tcctgaccta gccatgcatt tagtgggtga actctgagct gagtgggtat ctgtagattt 3180 

ccagggaagc cccacacaaa aagctctggc aaagaaaaga tccagacccc cacctccaac 3240 

tccacccccc cccaaaattt gtatccagct gaatccaaac ctgtgtaggt tttgtagcta 3300 

tgcaaagaac agtctagaaa taaatggagg gttttttggt ttggttttgg gggggggttt 3360 

agagcatagg ggttcctcca gtcatttgac ttcctctgcc ctccttgacg agaacatctc 3420 

atcatctgag gtttgctgga accccatctc ttcagctatg cgggctctag gaaagccagc 3480 

aactactaaa tcatgtgact gatttctgct aacctcctgg cagttgtact gagtctgtga 3540 

gtgacatcct gattaatgat cccagacagc tccacaacat tgacttgcac attgagacga 3600 

cagctcccta aacccagaat tatagttgca agagggttta gaagccaact ctttgatttg 3660 

ccagcctggg agagggtagc tcagaaaggt taagtaattt gcccgaagtc acacagcaag 3720 

ctggtggcat gtccttgctc agtgactcag catagagttc tttccactat tgtacatccc 3780 

atfcgtacccc agagcagatg agaacttgga aggagaacag ggttcttaac tgtgagagtc 3840 

tcattctgga atcgcccaac actgtgagag caagggccag agggaagaag agcaggaaac 3900 

acatagtgat cacatacaca tagggaagca agaaggggag gggtggatgc ccagaatttc 3960 

aaactacaga gtcaaaccga acaggacaaa tgccacacga aggagaactg gtgcctgtcc 4020 





tttcctgagg aggcaacatg gagtgttgtc agtccttggc aaactgggtg tgggggtgca 4080 

tgcctatatc ccaggtactc aggaagcaat aaggaggatg gcaagatgga ggccaggctg 4140 

gacctaacag agaccctgtc aaaagaggaa caacaataac aacagaaaaa gaatggaaac 4200 

ccttgcagaa ccttgatgct gtcttgatca ccccaggttc tttttgtcct atgcagagga 4260 

ttgcaaacta aagcacatag gttacttttt ttttaacaaa taaagtttta ttgaaacat 4319 



<210> 10 

<211> 3423 

<212> DNA 

<213> Mus musculus 

<400> 10 

ggc^agactg aaacccccag aggctggccc ctaatggccc aagtctgccc tacttctcaa 60 

aaataggcta catcccacag cctgtcaaaa tagtgccacc tactggggac cacatgttca 120 

aacctaggcg cctgtgagga gtatttttgc ctttaatggg taacaattct caacttgttc 180 

ccacaccaca aacatagctt atgtgagatt tggtgagcac ctgagcacag ttccagcctg 240 

ctagttgatc acttaaatgc agcagccaaa gtccaagagc acacactgtc tcttgaggcc 300 

agaaccccag cctcttggag catcaccccc cggtgccatg tttcttcccc ccacccccac 360 

cccgaatatt tcttcagcct ttaatgttct ctcgtgttct cctcagctat taaatcctgc 420 

aggtagataa tattcatctc tacttatcag gactctgtaa gtctatctca tgactacact 480 

ttaaaaaaaa aagatttatt tattttagta tgtgagtaca ttgtacctgt cttcagacat 540 

accagaagag ggtatcggat cccattacag atggttgtga gccaccatgt ggttgctggg 600 

aattgaactc aggacctctg gagagcagtc agtgctctta accactgagc catctctcca 660 

gcccccttat gactacacct taactgagtc tttcattcac tcactaaatc tcttttccct 720 

tatggcccac acggttgttt tttccccagg tatctgtgta aattttactt aagttctaga 780 

cagctttgtt gttttgcatg gtgtttataa ctggcatctt caatctagat ccagttgttt 840 

gtactgaaca aatattttga ctttcaacac aatgtcttac gttgttgcta tgataaatgt 900 

ttagtgttct gcataatatg cccaatagtg gagagatctg atgagagagg agagaagaaa 960 

gaggctcatt tcaatgatct gctgttgatg ttttgaaatg tcctagcagg acttggtgtt 1020 

ctgttcttcg actgagggaa aagcagttgt tagcagtcac ctgtcattct cctggcacaa 1080 

atcttgagcc caagtaatga agcaattctt cacagggttg aggcccacag gaggctgcat 1140 

agtagtctca gcagacagcc agcttcattc aggaggcccc tgacatctgt gtcctccctg 1200 

atgaccacct gtgcacactt tctcagagga agaagagaga tcatccttta aagccaaatg 1260 

tgaaacgagt aaatgattaa gtccatgcag tccctaaggg cccattcact ccgagtccac 1320 

attgaatgaa gcttcaagac agatatggaa atggcagtct cataaccttc tggctctaca 1380 

aggcatcact cattgtagtg ggccagagca gcattgcccc atagcaagac acaggaggat 1440 

attcccacgt gtctccatat tgtaaacaga gtaatgttat aattttttgt cagaaataaa 1500 

gactgacttc ctttaggcag gtttgtctgt ctgctgcgtg tagtgtgtgt gtgtgtatac 1560 

acaaagatga aaagaggtat tccaagtccc ctggagctgg agttacaggt gtacagagct 1620 

gcctgaggtg ggtgggtgct gggacccaga ctctggtcct ttggcagatt aggaagctct 1680 

ctaaacaaac atctgagtta tctctccatc cccataagca ggctttggaa agcagaacta 1740 

taccaggcta catgtaattg ttttacataa tctgctgcta tactgcccta tgtaaatata 1800 

actatttata gtgcccaccc actctgggaa ttatcacttt ggcttacttt gctatggatg 1860 

acagacgtgt aatagaatcg aactttttta cttgaatttt tttatatttc tggtgaggca 1920 

gagttttgcc tgtgataact gatggtctaa cctgaattgc atgctttcat tgtgcttgga 1980 

tgaattcttt atatatttgg aaaatccagt ttctatagaa tcttattgct gcaagcttat 2040 

aaaatgtcag atagggtgtt tagcattttt tgtgtgttat ctaatttaat tatcacagta 2100 

tctctgtcac agggcattct gatatggtgt ttaagtattg aattgggtgg ttcttggttt 2160 

aagttctgat accactgttt cctcttgtga ccattggtaa actttatata tcctgcaagt 2220 

cttcctttgt gttatatttt tattttgtgt ctgtctgtct gtaccacctc actgtctata 2280 

tgtgaatgta tgtatatata tatatatata tatatatata tatatacaca catatatgtg 2340 

tgtgtgttat gtatgtatgt atgtgtacgc atatgtgtat gtacatatgt ttatgtgttc 2400 

atctctctat gtctgtgtgt gtgtctgtgt gtatatttgt gtatctgtgt ctgtatgtct 2460 

ctctctctct ctctctctct ctctctctct ctctctctct ctctgtgagt gcatgtgtgt 2520 

agaagtaaga ggaagatgca ggcatcattc ctctctgggt tctagggttt gaactcaggt 2580 

caccagggtt ggcaagcatc tttacactcc gagccatctt gcctgcccag tttcctctcc 2640 

tatgtattgg aaataactgt ccctgtctta catgcttgta aggattgaga gagtgcatat 2700 

gtgcaccatc atgtacacag cacacactca tttaatatcc attgagacag tgggtgtttt 2760 
ctcttttttc cctgggttaa tttaggttga gaaccaccag tctagctggt ataatggaaa 2820 

agacccatgg gctacttttg aagtaaaggg ggatttattt aggcttagaa ttttggagtc 2880 

aaacctaagg ctaggcagcc cattacttta ggccttgcac gccatggcag gagcacatgg 2940 

gcaagtaagt gctcacttct ccagccagga agcagcgtga gagactgact cagggtccac 3000 

agtctcatgg gcatagctca aggtctcagc ttcgaagatc agcagcacct acctaccaga 3060 

cctgccaccc tgggtatcga attctgacct ctgagggggc acccagctct aaagaaaacc 3120 

atctagctga ctctatcaaa gctgctgttt ttatagaagt tggtaaccaa ggcttaaaag 3180 

tatttgaatt atttccagtt ctgtgccttg tgctcaccaa aggctacagt gttgccaaca 3240 

gtcaagcagc tctgtgtggt tggtgccagg gacccatgca tggggtaggg ctcagaacct 3300 

gcagtcctta tctgggtact cagccagcct ggagcagatg aagctgaggc atgtcaatca 3360 

actaggagag aatttccagt gggaaaatag ggatgcccag cttgtgctca ggctttgggt 3420 

ttt 3423 



<210> 11 

<211> 3340 

<212> DNA 

<213> Mus musculus 

<400> 11 

^cttccgt ccaccccttt ccaagtcctc agtgcctcgg tctctcaccc tctctgcatc 60 

caagaccccc aggctcggtt ccgggcgcca cttcctcctc ttcaggtcag atctgttccc 120 

ctggacccag gttctctggt aggtgacctg gaaggccttt gtccccgggg gcggggccgg 180 

caccggcaca ggacacctgc aatgtcaccg tcttccacga gggtgcgcgc tagctagcac 240 

cacttcttag agctgagaga cctttcaaat cctgcttaag agtcccgggc tttctctaca 300 

gccttatcag ataagagtca ttaacaggag gtggagctga aggagaaaag aagccctagt 360 

gaaaagaaag aatgcataaa gtagccaaca ctgctgggag ttctagagat taagggaaaa 420 

ctttcactac ctgcttcctg acttgaattt cataaattac ggtattgttt attttattac 480 

tgtttttgtg ttttgttttt tagagacagg gtttctttgt gtagctctgg ttgtcatgaa 540 

acttgctctg tagaatagct taacctcaaa ctcacagaga tccgcctgcc tctgcctccc 600 

aagtgctggg attaaaggcg tgcaccacca ccgcctggca cttattttac taatttttaa 660 
aatttgtatt tttataggct agagagataa ctgggtgctt caaagctgcc gttgaggaga 720 

acatgggttc gagtatcagc tcacagctgt ttgtaattcg tttcagtgaa cctgatgctc 780 

tctctggtct ccatggactc cagacacaca tgtgatgctc aggcaggcac acatacaggc 840 

aaaacatcaa aacgcataaa ataagttttt tttctaaatt ttgtaaatat gtgtgtgggg 900 

attacatgta ggggaaggca tgtgcccaca gaagccagaa gagggtgatc tatcccttga 960 

agctgaaatc acagatggtt gtgagctgcc tggcatgggt ggtgggaacc aaactcaggt 1020 

cttctgcaag atcaatcgtc actctgaact tctgaaccat ccttctggtc ccttttttat 1080 

tttttttatt aaagatttat ttattgtata tgagtacact gtagctgtct tcagacacag 1140 

cagaagagga catcggatct cattacagat ggttgtgagc caccatatgg ttgctgggat 1200 

ttgaactcag gacctctggg agaacagttg gtgctcttaa cgttgagcca tctctccagc 1260 

ctcccctttt ttattttcaa ccttagatgc catatctttt gagctgtgca cctcgttttg 1320 

ttgtcattgt tgtacatatt ttacaacttt gagagactat ctgtgtgtat acatacatgt 1380 

gtgcaagtgc acaccaccat gtaggtgtgc ctttgtgcag actagaggtc agcactgggt 1440 

gtcttcctcg gttgtgctcc atcttcttct ctgagtcagg gcctcggatg aaggccgaat 1500 

cttgttgatt ctgctttggt ggctggccgg ccggcctgcg agctcagaga gcctcctgtc 1560 

tccacctcca cagggctggc attatagttc ccgtgttgga catggaaccc aggccttcgt 1620 

gctcatgtgg caggcctgag ccatctccct gacccccgtg tactcccatg tcacaaggag 1680 

acgtacacaa aacccgcctg tctccagaaa cccaagttca gttctgccca ctaggtgctg 1740 

agaaagataa cgttgagaag gggatttagg gatttctaat tgccaagctg tgctgcttct 1800 

accttctacg cacagtgagg aggagatcgg aaaggagaga tttgggctca gaggggcatg 1860 

caggggtgtg gagctggatt cagacttctt atggttttgg tgttttcctc tatgaatcat 1920 

ctctgccatt ggcactgaag agcatgcgcc tcacacactt atggtcccat gacctctagg 1980 

aattacttgt agggattaca tggaatctgt gatgccgcta ccattttaat tattgcaatg 2040 

tataagcaca gaagcatgtg tgtatgtatg tgtgtgtaca tgtctgtgtg cctgtgtgct 2100 

tgtctgtgtc catgtgtgct catatgtgtg tgtccatgtg cacatgagtg tgtgtgtgtg 2160 

tgtgtgtacg agcatgtgtg catgtgctct tgtgcacatg tgtgcacatt tttccatttc 2220 

tttttaaaat gtttttaatt taattttttt ttttttatgt tatgaaatcc gcctgcttct 2280 

gcctccgaag tgctgggatc aaaggcgtgc gccaccaccc ctcagagagt cttccatttc 2340 

tttaatcagt ttttcagaat ctggtctgag gatgtagctc agttggggtg cttgcctaac 2400 

acgtttgagt gctgtattca aacctaggca tcgcgtaaac acagtgtggt ggagaacacc 2460 

tgtaattcca ggaggcagaa ggatcagaag tccttgagca gccttggtta atagggagtt 2520 

caatgccggc ctgggcaggc atctctgaac caaaataaac ttgaattttc aggtacaggt 2580 

tttccctctc tgagtcctta gaactgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 2640 

tgtgtgtcgt aaagggcatc gctttctgat ttgtcttttt taatatttga tgtctcctct 2700 

ctctgtgttt ttcaagactg ggtttctctg tgaagccctg gatatcctga actcactctg 2760 

tagaccagac tggcttggta ctcagagatc tgagtgtctc tgcctgtctt ccaagtgcta 2820 

ggactaaagg cttttgccac cactgcctgg ctagaaatat ttctcaagag ctagagaggc 2880 

agctcattag ttaagagcgc ttgctcttgc agaagagcag tgttgggttc ccagctcctc 2940 

gaccaggtag ttcacatagt ccaggaactc catctccagg ggatccaacg ccttcttctg 3000 

gcctcagtgg gcacttgcag tttccttgta cacatattac taaataaaaa gaaatcttta 3060 

gaaaaataaa ataaacaaca cagctgagta acttaaatta ttcacatggc tcccggcttg 3120 

tggctgccca ttctgcctca ctaaggtcgc atctttggag ttaaaatcag tatagagctg 3180 

atgttctcct taatcttcag acaccacagt cacctgtagc tccagttcca gggttttgat 3240 

gccctcttct agtctctaag ggcactacac actcgtagtg aacatacata catacatact 3300 

tgcaggcaaa acattcgtac tcattaaatt ttttttaaag 3340 



<210> 12 

<211> 3933 

<212> DNA 

<213> Mus musculus 



<400> 12 
gtctctcccc 

tggcaaagct 

ggcatctgag 

gcatccgggc 

gggctggagt 

gcctggtagg 



tccggctctc 
tgccttacct 
actcagctgt 
gtgctcctca 
ggcttgtgtg 
gtctgtctga 



aggcatacag 
gtaccaagat 
atgactagag 
cactggtgtt 
cctcggtggg 
gcggtgaggg 



cttcccgggc 
caccagcctc 
gacatttcct 
ggcgaggcat 
tttgtgagct 
tgtgtgaggc 



caaaggcaga 
tccgttctct 
cttagccttt 
ggggcttcct 
gggtggaccc 
atggtgacta 



actctgccct 
tcaggaaccc 
ctttccacag 
cacagtggtg 
tgaccccaga 
gcccgtgccc 



60 
120 
180 
240 
300 
360 
cacttctacc cctacctgct ctgccctcat cgtccctggc actgctggtg ggccctaccc 420
tgtccactgt gtctctccat gagtcctaag tcctctgtga gcggcgtggt ccgtgtgcac 480
ctgtgtgcac gtgtgtgtgc gaggggcaca ggagtcctgt cttgtctctg tgctgtgtgg 540
gcagatggag gttggcctgt ttttactctc tctgtgtttc tccttgtctt tttttattcc 600
ctctcatcct catcgcactc tgccatcaac ccaaactctc atctctcaga tcagcgagaa 660
ggttggtcgt tttcacttct tatccatcta cagttcgccc accgactgtg cccgtcagtt 720
gggaccagcg ggcttggtcg gctccctcag gctcgctcca gccgcgcctg cccgccagtt 780
ggcctcttct cggtgtttgg gctggctggc aggcaggacc agggatgggt gggcaggcct 840
tctgccctgc atgtacctgt tgcttgaccc tgattcgctg tgtgtctgca tgtcccctta 900
gccaggctcc gctgttggag tggccacaca tgcacggtgc acggcataaa actgtaccct 960
acagaactgc actcagcttt gcacccagac caggctagcc ctctgtttca ggaatggtca 1020
tggggatctc tgtcagagaa actaaactac atgcttggca tgtaggagtt cagggaagct 1080
tctaccattt ccccacccac ccttatgttg tctgcggaac taagagaagg cagaactgca 1140
ggaggcaata ggaggaggct aatttagagt gagttgtccc cttgagatct cacagtacgc 1200
gacacagtcc tgcagtgtct gccctgccct tgaggccccc agagccttac caatgctcca 1260
caaagcagac tggctgtact gagacttaag accagccttg gtctgcctct caaagaccct 1320
gatggccgta gtccaaggca ggtggcatca ttttcataca ctgagtaagt gtcccgacat 1380
tatatctttt ctgtctccta ccgggtccaa accaaggtca ccctgcctgc ccagaccact 1440
ctacccgagc ttccaagcca gcctctgtgc ctctccgagg gtacagtgta tgtggctcac 1500
tgaccagcta acagccctca gtcccatgct gggcagtggc actgagcgtt cagatccagt 1560
ttgcttatgc ctggagtgac tgaattgtca ctgcagtctc cttaggcctt gagtatcaat 1620
cagtccagct cagccttaga ttgggcaggg ctttttgaaa ctccctggat cctctcccag 1680
agctgtcccc tggagccagg cctctatttc ccagtacagg accaacaagg ggctggtggt 1740
cctttatata tcgtgtggtg taaagggtta atcactgggc tttggggggt ccaaggaaga 1800
agcacagggc agaccccagc taccaggtgg cctctgcatc atgctctgtg gctttgagtt 1860
tactccggga ggacaaggga gccagtcagg aggagggctg cagcctgggc tcagatatca 1920
acataagcct ccatctgtat gcactcctga gctctggggc atttcggtta caccatttcg 1980
ctgaccaggc ggatctgcca gggcgttcag aaaatagtca tcagcagaca gcccagcccg 2040
cagccactcc ccccacaccc cttcacctac cctctgcatc ctttgttctc tccccaccca 2100
ccctgtggtc ccaccaaaaa tacacacata cacaaatatt taacgggaag gagaaatgag 2160
tttctaaata ttccgggggg atttgccggg ctggtaattt gtttggtgct gataattgca 2220
tcttacattt ttctagcttt acttaaaact gtgtgccggg tgtgccagca tttaattact 2280
gctctgcggc agccacaagg ttattstatta aagagttatt ttatcttgat acagtggatc 2340
ctgcccctta tcccctcctt aatgttgtgt tattttcatc agagaaattt ccgcaagtga 2400
atgcagattg cggggctacc tccccctgcg attaggctgt cactccactg aatgcggata 2460
cctccaggcg ccgcaccacc ctattatagg gggttgtcag cgcaaataat tagacttaaa 2520
agttacagct gaaatataat ccagaaatgg cagggccctg ttttggaaat tgtctataaa 2580
atgtcagcag taaggatgca cggggaacag taatagaccg gcattgttgg agcctgagat 2640
tagaccctaa gtgcattttc cccagctcca gtttttcctt ccctgctgtt gcttctgtca 2700
atagaccaag tccagggaga gtcctgttcc ctttggaggc cctgtgcttg tggcggccgg 2760
gggaggacgg tggagatgct atgttggagc atcagccact tgcaactgtt ggcacaggag 2820
tagctgtcta ggctgggcta ggacacaggg cctctagcat ttggggcact ttgttgcctc 2880
tttccccatt ttcagtagat atgctggcct acctgagctc tagcagattg acttccagga 2940
gtctttgcat gtggctggac cccagtgccc ttctatagga atgtagttat cagtggaata 3000
cctggggcct ctggctgagt tcctttccca gtgctgaggt gaccctgacc tcacacctgt 3060
ctgtagcccc caaagttctc cctatagctt gcatcttgga gggaaaataa aagccattct 3120
tagccgggcg tggtggcaca cgcctttaat cccagcactc aggaggcaaa ggcaggggga 3180
tttctgagtt cgaggccagc ctggtctaca aagtgagttc caggacagcc agggctatac 3240
agagaaaccc tgtctcgaaa aaccaaaaaa aaagaaaaag ccattcttag tgtcacttcc 3300
catgggggcc cagtttcctg acatcgacta gagatggagc ctttgaaagg agcggcccag 3360
ggcattggcc tggctccgaa gctgatccct gaggactctg ctggcttgga aaacactgtc 3420
caccatggac tatccagggg tcaaagtttg gacatgttta ggtatgggcc ccaggtattt 3480
tagccaaaga cctacttcct actgcataaa cccagtggcc cctctttcat ttgggttcca 3540
ccattaacta gagccaccat taactagtgt cactctcaaa agtcttatct gtatgccatc 3600
tggagctcga cattatgcca acgctaaagc cccatggcct tcatgggctt tcgaagatag 3660


agtttttacc 


acacagacct 


crafc c t fc c c fccr 


a yy aLai - aclcl ygcagd.tgcy tcaccccccg 


3720 


tgtcagcatt gactctggcc ctcatctgca attgagacat ggtggcacag gcaccctctg 3780
tacagagtac agagtggctt tccactaggc atgatgtgca tggctttaat cccagcactc 3840
agaaggcaga ggcaggtaga cctttgtgag ttcaaggcca ccctggtgta tacactaagc 3900
ttcaggatag ccagggatat ataaaccctg tcc 3933


<210> 13 

<211> 2272 

<212> DNA 

<213> Mus musculus 








<400> 13 
cratacratcrat 


acrafcacra haa 


atagatagat 


agatagatag attagataga tataaagtgg 


60 


aaat agaag t 


tccaaahcra 


ctgtgaggag 


ccagggatgg cactgttcca tgtaagtgtt 


120


ccctggagac 


attfcaa tct"c 


tacagttgtc 


tggcctggcc ctcttagact gcactgtcca 


180 


ataaaactat tgtagaactg tttaaatcca atttgaaatt tcaaatagcc acataaaaat 240
gtaaaagaag ataaagtcag aggtagagtg caagtttagc atttataata ctctgggttc 300
tatcaccagc acagactatc agatgataaa tgatagatga tagatgatag atagatagat 360
agatagatag atagatagat agatagatag atagatattc attccttttt agtaagttca 420
aatagtcttt gaacatgtaa tccacataaa gctaattaca aaataaatat attacatgga 480
tttttatatc atgtctttaa agttttatct gttgtgcata ttttaattca gatcagtcat 540
aagtagctgc catattggac agtggagcct tacaagactg acccttcccg ctgccttcca 600
aaacatgttc acatttagag gatggactct gccaacactg cctaagcccc ttcattttac 660
aggtgaagat gaccagatct agagatgctg ggttgcttag aatcccatag ccattcacat 720
cagaagcccc acccttttgt ctgacttcaa ttgctttcat cattctttct ttccttagat 780
ttgtgttgct gcaactactc actgtgagac tgttgctgtt ggttaaatat tgagccagga 840
gtcttaaaaa catttggtcc actcagtaaa gccactgttt gatttggtgg aaactgtttt 900
gagcagagta ggtagaacta ccaggcacat tgacagttct ccatgagata gtaataggag 960
gtgataatgg ggacaggggt gactttatgc ttgttggtgt tgctgtgggc attataggag 1020
tatgtgactt caaattttga aatgctagct gcctgcttac aggaagggag atgagctgtt 1080
cccatgcaaa gtttattgtc tagttgtatg tataagccta cacatgatga atatcttaat 1140
gcttaactct ctctggaaat ggcagggtta agcttttata tttcatattt ttttcttgta 1200
ttaggaacgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtat 1260
gtgtgtgtgt gtgaatgtac tcctatttct cactcagtta tacttaaagt gtgatgattt 1320
ttgacaagtt aatcccagta cactcctggc atgtaaagta ttttcagaat gtaagcccat 1380
gtggaaattc gttggcttgg tatgacgggt tcgggcaggg gaggcatagt gtgatcccgt 1440
ggagtgttcg gagctgccaa gtatttgttc ttcggtgtga aagcgtgctt gacgggtgga 1500
ctttcctagc tggaaggtgc tttgttcatc cttcctaaaa catttatcag cacggatgtt 1560
gtagatcgtg taagagagtc ccctttattc atgtgactta gttcagtgat cttagattta 1620
gagttaagtt tttgtgttcc agggttggga ctaattccca gccatggccc agctttaccc 1680
acaaaggaac taccgacctg taaggatatt catttgatgc gtttgctttt gttcctggtg 1740
ccagtcattg gattttgcct atagtgcttt atgtggctat acttagaagt tttatagctt 1800
gtagcaatca ggtgaaagta cagggatatg gctgagtact gtaacttgcg tctgtaatcc 1860
cagccagagg caggaggatc accagtccag ggcaatttga gctacagagg caaactgaaa 1920
aaggaaaaga aaaaaaaaaa gagattcctt tatttccttt ggccttcata cacaaatatt 1980
taaattctta atgtacggtt ttaagtcagc cccctacctc ccccaccttg gtagtttgca 2040
tagtacacat tagcatttga aacaaaagtt ttgagctata agatgctcat gggagtcatt 2100
ctgaaatatg tttaaggtgg attgattcgg aaaggatgca aatccaagtt aggctggctg 2160
aggcctcccc gcttctttag agggtgatac ggaaggtttg aggacccagt ctgcttcggt 2220
cggagtgtgg acagccagtg agggcagtgc agcaaaaccg tgtctcagaa tc 2272



<210> 14 

<211> 1554 

<212> DNA 

<213> Mus musculus

; 4 tg?atg 4 tac acacatatgt gtatatgtgt gtgtatgtgt gtatatatat gtacatatat 
atgtgtgtgt atgtgtatat atgtatacat acacacacaa acacatatat atatatcaca 
gaaagttatg gaccatttta aaatatcact atttgtcatt gatttagttt tacacaatac 



60 
120 
180 
tattctccag taacgtttct tctttttcca tttctttcat ttcttctctc tcttctttta 240

ctaaattggc acacttgttt acttctcagg ttttgcaaac atccttaata tgttggtatg 300 

gcaaatgcat ttcaattcaa atagcaactg caagttgata tgcagaagaa cagaagtaat 360 

gtggtctgga attaatttgc tcaattgctt acaagcaata tgctatgtat gaaaagatta 420 

ccaaattaca ggtattaaat tctaaatgat ttaacagttt attatgattt attctattta 480 

acaacctagg gagtttgaag ccattggaaa ctccagtggc aagttcaggt gcagttggga 540 

gttcaggaat ctcaatggag ctattagtgt agcagacttg tgggtaaaat atatgcatct 600 

ctcaatagca aatgtgaaag actctctaaa atggacctca gtgacttcat tctccacaag 660 

ctgtcctggc ccactcagca tatcctttat tgttccatga tattaaggca tgtctgcttt 720 

tttttttttt taataataaa tccatgacca cctgaaggtt ttaacttggt atgcacggca 780 

ttaaggcttt cacagccacc ataggtattt tctcctccat ccattatgtt attataaact 840 

attactttga tgccatagga actttcttag tttttcctac acattctggg acactggcag 900 

aaataacttc ttccagatta tagttgcttt gtgcaacagg aaggagatca gtcaacaatg 960 

gatttataaa ctctcagata acacagaaag ttacatctca tgtccaaaaa tgaccttatt 1020 

acttggccct aatttagttt tgcatgaatt tttaaagcag gaatcataaa tggcccaagt 1080 

gtcttaattc aaattcttac ctgactgtgg agaagaaatc attgtgattt ttgaacacac 1140 

ataattatgg tttaaaaaca aactaacctg aatttgtttg aatatagttt ttcaactttc 1200 

tctgatacta ataaactcat cagccagttg gaaattgctt tccagaaaaa attctcattg 1260 

cttatatgtt catatatttg tacatgtata aaagatttca aaaacttgga ataagtttag 1320 

accattctta ataaggatat taatggatat ttaaagtgcc tgtcttttca gtgtttggaa 1380 

atttatttag tgtttcttca gggcgtatca aatcaataaa atgtccatgt ctgtgatgac 1440 

taggaagatg ttcttctatt ttctttctct tgctgcataa cctgtcttat tcagaactaa 1500 

atttcaaatg tgaacagttt tagctgaacc actgaattaa aataattatg aact 1554 

<210> 15 

<211> 4007 

<212> DNA 

<213> Mus musculus 
£££g£tg tgtgtacatg tgagttcagg ggcacatgga agccagaagg aggtgtgaga 60
agctccacat tggtcctctc caagagcagg agatgctgtt taccactaag ccatctctct 120
gtccccagga agtgacaatc ttcaggcagt ttgtctgcat tttatacttt ctatttaatt 180
ataatattaa aattttgaag cacacactga gaaatctaaa ggtggtttat tcctttccca 240
tttttttaaa gagagtcaaa taaaggttca tttcctcaga ggaggtcagg tgaaatacaa 300
gggcttattg tcactcaaga gattcttccc aagaaaagtc acattaagaa aggaaaggag 360
agagacatgt aggtcaaaac agagataatg aagtcaagca aagataaatg tgacttaaat 420
aggcatttaa tgtggctgaa tcatatttaa ccatgtgttc tggacatata cccaaagggt 480
gagtccggtg gtatatcacc ttaggtctaa ggggaggaag gagcattctg attgagaagt 540
gttcatcact tttcttctgg gactaaggcc tttcccataa tgatggttta attcaatctc 600
gtcaattacc tcttctcaga gagctgcagg ttgtgccacc ttgctggaaa tggtcaaatc 660
ccattaaact tctgttcttt gtagcagagg gaaaaggaaa gatccagcga ttcaattaag 720
acataaatga ccaggtgtct gtggcacaga atcagctttt ctggtctgac caagaaaacc 780
caatttccat cttggctagg aagctatgtg agccctcact cacctttgct gttagcttcc 840
tggccatggt taaaggctca gactgcacag gaacacacaa gagttttctt tgtattcaca 900
gtggaagagc ccccctcccc acggagttgg gtggagtggc cagaagacct tctccatccc 960
tgccctttgc tgctttgttt ccagaaagga gggggaggaa attaccaggg tatgagatgc 1020
tgcctcttgg gaacaaggaa attaatgcag cgacctgaat tatgtgcgtg acacatttgc 1080
atatcgctca ctagcagttt ctgaagaaac aacaaatttt gtgtaattct aatctctttt 1140
gcaaagcaag ggaagaggaa aagctacctc cgatttactg tctacaaaga aagctgaatt 1200
ttaggatcaa atttgctcac atttagtggt aggctaaatg ttaaaacata agggtgcatc 1260
ttttaaatca ctctttgtgt gtggctgagt gtaaatgtca ggcatgtgta catgcatggg 1320
tgcatgtgtg catgggtacg tgtatgcatg catgcttgtg tacgtgtgtg cttgtgtgcg 1380
tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgcacatg tatgcatgtg 1440
gaggacagag ttcaatgaca ggtctcattc tet.tc.tu tcaatgataa tttatagctt 1500
gaatatcttg tatcattata cctatgtctc ccagatcatt atattcattt gggagggtat 1560
ctagtataat ttactattat tatggtaatc atggaatttg ggggatggaa tgactgattc 1620
attgtcagtt


tataaccagg 


wyciwy Lay c< 


aaar**"*-*'"^ — > 
aaaLbCaaaL 


gaagaccaat 


tatcaatgtc 


1680 


tttgtagtca 


ataaaattaa 


t" a t* r* ♦* onra nra 
i— a i— ey y ay a. 


cty tyay CCCC 


tctaaggaaa 


atcttctagt 


1740 


cagacraaaoc 


a i~ a ana a ctnci 


dy dctCy CUtd 


a s /~t4- 4* 4— /-»4- -n 4— 

aagcccgcac 


caacttctct 


ttgtgaggtt 


1800 


taagcaattc 




dCCy y CCciy c 


accaccgcaa 


tcctcgagat 


ttctagaaga 


1860 


crcaaattaat 




ay ll c ccccc 


gaaacaagga 


aaattggcat 


agtttagaaa 


1920 


aaaaaaatgt 


u yyy a y a l. e e 


CCC ddC dd C C 


gccccccgcc 


gaacaaaagg 


ctggtacttt 


1980 


ccttaatcaa 


y i— Ly <w cty ct ctd 


4™ <~%r* 4- #-r #-» 4~ rtr^ 

eye eg LLuyi. 


cccgcgagcc 


ttgtagagag 


aaaaattaca 


2040 


cattttataa 


av-uLyycu u.y 


y ocy cccccc 


4-/^4- 4- i-r-s 4- 4- 

cgccccgacc 


4— x> — . _ _ 4_ 4_ _ i_ 

tcgacttctg 


attctgctct


2100 


era 1 1 ccra n t* t- 


cty eedCCddd 


y ccgacagac 


tattgaccca 


cagtcacagc 


ttataccttc 


2160 


4- fii- t - 1-* t- r« r-« r» 


LCCyuCCCCa 


gaagtctctg 


tctcttcact 


cctcctatca 


cagtgttaat 


2220 




a/ra ■hria /-« 4~ 4* 4- 

dyatgacuct 


gacccaagca 


gaggcacaca 


gggtagagct 


gacatcatta 


2280 




4- /*■< 4~ 4"" pprrt - 
LCCLLCCCCU 


ccacccgagc 


4.4.1.-4.4. ML-t 

tttcttatca 


gataaaggtt 


gctagtgttg 


2340 


1— c*. v_ ciy \^ 


-a -a ana a 
c L.L.ud.aayaa 


ac aagyc u eg 


gggatgctcc


aggtactaca 


ttcagagctg 


2400 


CfOdCfCf err* t* a ct 


a t~ 1~ t" ri/'ia a a 
ClLL LyyddLd 


cyccaagcac 


tttgaggcca


gaagcacaca 


tcttttctct 


2460 


t cert* era a era t* 


ftf *~ /™t t* ^"t +~ a 

gctytgccca 


gcaagaacaa 


cgacaacaca 


caagccatgg


actcagtagc 


2520 


a V—- t—y aay L->— } e 


ygutcgaayt 


gggccccacg 


4 ,4-4 i4-4-4 v 

Lattacctaa 


ttgttagtct 


gaccttcaga 


2580 


ctaaaarrrh 


nna fir* t-ahaa 

yyaycLaLaa 


/"^ a a i^/^a 4- a ^ 4" 

caaygauaac 


gcacccaagc 


aacccagctg 


gtaagtggta 


2640 


aaacataaar 


L.y aw^ciu cty ct 


LLLLLdLLLL 


ccyygdCCciy 


aag c ccaacc 


agagagg t aa 


2700 


atcaaatcaa 


f* r* a a a era a a a 


C CdCy C ULLd 


a rrt* 4~ a a a f~ fr* a 

ag c Laaat.ua 


agcaccccaa 


aattatctag 


2760 


taatctcatc 




/Ta a a 4^ 4* 

yacacaty cc 


a af-<- 4~ ri 4™ a 4* 

agacccgcac 


4*^ /tz-ff* — x ^-f 4— y-i 

"-yytaccctc 


aaaatcaatt 


2820 


f- = pr«a t" a t~ cri* 


c cy cy llucu 


4~ i^r4" *^/^4" a /~r -n y^ 

cycgccagac 


cccggagcaa 


c t g 1 1 agaga 


tcaataaaca 


2880 


a Cf r* \~ c* t~ fi a <-« a 


f"»a^ , t*#~r"»a4*ay» 


ccccacgaag 


4-4-— ,4-~™ # _,4-4-— , 

ccccgccccg 


gagaatctcc 


ctagtttgag 


2940 


tttctcaggt taaaattctt ttcaagaatt aattaaatta cttcactgag ttttagttcc 3000
tgtctctctg cttgcagatt catatgggga cccatttcat cattatattg atttagattc 3060
ttgccattca aaacgcttgg catccattgt cagccatact acagaaacag agatgcaccc 3120
ttgtatagtg tgcagcgcac attccctcaa aatatttact atgtagttct ctatacgtta 3180
ggaaatctat tcattcaaca cagatgttga gcctcattta atttactttc tccactttac 3240
tgtatttttc ccagtatatt tttgccccgt ggtttttatg ttagaagaag ggccaacaga 3300
aagatagcag tgagttgggt aacaagtgtt tctttttgca aagtaaaaga cgccgttgct 3360
gggaaagcag taagcgtgaa cagggaggac tgtctaaatg actgacgcag ttggtgatga 3420
cgtcagcact gggcactgac ttaacagaac tccactggga cttgtctatc ttctgtttgt 3480
ccagtttgca gctaatccct agcaatgtcc tcagaccaca ggagaagggc taaattgctg 3540
attaaaattc cttaagaaac aaaataactt caactgtatc ttccttgatt ttggttttta 3600
gttcctagac actgtaaaat gacacacaca ggtgctgctt acctgctgct gtggcatcaa 3660
tattgataaa gaaatgtact gcaaagtgtt tcatccatac ccagagcagg tgctagagca 3720
attcattgta tatcatataa gattagagtc ttaatccttt tttttttcag atatacgtgc 3780
atacacacac agatacagac acacacagag aaagacacac agacagagtg gcgcgcacac 3840
acaaacaaat gctgtgctat ttgtatgtat aaccaacctg ggtagtatct gtgtactgtg 3900
aaccctctgt ggttgtggtt tagccatcag agacgagcca tcagagccca gcctgatgtt 3960
agagcccagc tcctgtattt ctcggctgag cagttctttt tccagct 4007



<210> 16 

<211> 2755 

<212> DNA 

<213> Mus musculus

<400> 16 

gaacgatagg gccaaaaagt gggagtgggt gggtagggga gcagggcagg ggggagggta 60

taggggactt tcaggatagc atttgaaatg taaatgaaga aaatatctaa taaaaaattg 120

aaaaaaaaag tctgtttcca tacccaaaag atattcaaat atctgatata aagtctctaa 180

ggagagtaaa tagagcctct tatcaaaatc tagcatttgc cattcacttg agtaagatat 240

gcaagatgaa gttttaccag acctttgtgt gacagaactt ttcctgggga tacacaatac 300

acatatctac tcacccagac agggaatcca tgactgacca cagtacagat accaccgaag 360

tccaacttgg taatccaacg aattttatta cgattactta cgggcgtatg gatgaggagt 420

tacttaaagt agcagaaacg actgagaaga ctgtgtcacc aaagcccacc acagcatagg 480

ggatgactta caaagctggg aaccaggagc acactacaca gtctgcaggc agctcaacca 540

gttggggatt gtccttgcca ggtgcctcag ttggtctaaa ccttttccag gcagttggtc 600

tgggctcagt cttctctgca gcttggtttc tctgagagtt gacacagctc aacttccttc 660

tgtctgagag agactttcag ctttttacaa ttcaaatgtt atcccctttc ctagtttccc 720 

ctctgaaaat cccctgtcct ctcctccctc tcgctgctcc ccaacccacc cactcctgct 780 

tcctggcctt ggcattcccc tataccaggg catagaaatg tcacaggatc aagggcttct 840 

cctctcaatg atgacctact aggccatcct ctgctacata tgcaattaga gccttgagtc 900 

cctccatgtg atttctttga ttggtggttt agtcccaggg aactctgggg ttgctggtta 960 

gttcatattg ttgttcctcc tagggggcta cagacccctt cagctccttc attggggacc 1020 

ctgtgctcca tctaatggat gactgtgagc atccacttct gtgttcatca ggcgctggca 1080 

gagcctctca agagacagtt atatcaggct cctgtcagca agctcttgtt ggcatctgca 1140 

atagtgtctg ggtttggtgg ttgtttatgg gatggatcct cagaagggac agtttttgga 1200 

tggccattac tttagtttct gctccaaatg ttgtctctaa taactcaggt attttgttct 1260 

cctttctaag aaggatcaaa gtatccacac tttggtcttc cttcttcttg agtttcatgt 1320

gttttacaaa ttgtatcttg ggtattctga gattctcggc taatatccac ttatcagtga 1380 

gtgtatatca tgtgtgttct tttgtgattg ggttacctca ctcaagatga tatcctccag 1440 

atccatccat ttgtctaaga atttcatgaa ttcattgttt ttaatagctg agtagtactc 1500 

cattgtgtaa atgtaccacg ttttctttac ccattcctct gttgagggac atctgggttc 1560 

tttccagctt ctggctatta taaataaggc tgctatgagc atagtggagc atgtgtcctt 1620 

cttaccggtt ggaacatctt ctggatatat gcccaggaga ggtattgcag gatcctctgg 1680 

tggtactata tccaattttc tgaggaacca cctgactgat ttccagagtg gttgtacaag 1740 

cttgcaatcc catcagcaat ggaggagtgt tcctctttct ctacatcctc accagcatct 1800 

gctgtcgcct gagtttttta tcttagccat tctgacttgt gtgagatgga atctcagggt 1860 

tgttttgatt tgcatttccc tgatgattaa ggatgttgaa cttttttttt tttttttagg 1920

tgcttctcag ccatttggta tttctcagtt gagaattctt tgtttagctc ttaccccact 1980 

tttaaatagg gttatttggt tttctggagt ccaacttctt gagttctctc tctatatatt 2040 

ggatattagc ccccttttgg atttaggatt ggtaatgatc ttttcccaat ctgttggttg 2100 

ccattttgtc ttattgacag tgtcctttgc cttatagaag ctttgtaatt ttatggcatc 2160 

ccatttgttg attcttgatc ttacagcaca agccattgct gttctgttca ggaatttttc 2220 
tcctgtgtcc atatctttga ggctcttcct cactttctcc tctgtaagtt tcggtgtcgc 2280

cagcactgct tttgcttact ctgtcaggga cgggttgaat caatctggtc cgtttcagag 2340

agtttctgat gctgttttaa tatggacaaa actgttacat aaaacacttt tagtcaaacc 2400

ttcacatgtg ataccaacca gggtcacatt tgttaacctc cagagagatg aacaaagtac 2460

actcagaaaa ccctcttcag attcccaacc taattactct caaacagaaa tactactttt 2520

gtttttgttt ttcagataga gtgtctctat aaagccctgg ttatcctaga aatggcttat 2580

gtagaccaga ttggccttga atccatcaag atccacctac ctctgctttc tgagtgctgg 2640

attaaaggca tgtaccacca tgccaggtta aaaaaaacca cacatacaaa aataatacaa 2700

aactaatcaa ccaaacaacc aaccaaaaaa caaacaaaca aacaaacccc aaacc 2755



<210> 17 
<211> 1811 
<212> DNA 

<213> Mus musculus
<400> 17 

ttttacatgt acctagagaa agaaaacaaa aaaacaaaaa aaccaaaggc tcattacatg 60
ttgagtgtct gactgaaagt gtagccctgg gactccatgt gtaacagtag caatgggatt 120
gcagcaaata agatgggaga agacactcag gtatggctgc attgaggaaa ctaatcatat 180
ccagcatttc tggtgttgaa ttaaactgac atggtgaaag tcaagctttg gttttgtaaa 240
tgccagagaa aaaacagaga acattgcaca gtaaacctga gtttaaatgg cctcagtgtt 300
tgaggcaagc tttgaagcta gggtgtcaat atctgtgttt tctgtaacta atctgaagag 360
ccctcgagag tactgtgtct gtgaggcctg aagttcagat ttctctccag agttcttttg 420
attgttctta tgtacccagg agaagtcatt gtgattctta gaatggatca tttgagtgat 480
tactgctcct ctggtggctg acaaacaaca tggtcccaaa agaatttcag gggtaagacc 540
tttcggtctt agaacatcca catgtggcag ggcacattgg ccctttctta cccagaactc 600
ttttgtgtgg ctgcatctct ttctccagtt ctctcttaac tggagcatgc ctattctttg 660
tccttttgtc tttagtatct ttgttgttgt tagatgccat ctgaagttac actcacaagt 720
ttaagcaaat gctgccttct tgtcagcctt gtgattctgt gtgcatttca tgcaagggaa 780
cctgtatgtt tgctctctgc attcttacat ttcatattgt ctcttcttcc accacatggt 840
aaatgttatg agtacacatt tgtcttcatc aacgaaaatg catgtgaaaa cctctctgtc 900
cttcttgcat cttagctttg ttttgttctc acatttttac tgatattcca caagaaagat 960
ttgccatatc catttaggaa acattgagcg atagatagtt ctcctagaaa ttctcaaaga 1020
catgttattt accatatcaa actctcagga atgtattgga aaattatcct agtaacattg 1080
ctgttactta tacctgctgc tggaggaaaa aagtttattc atgacacatt tacctttgat 1140
cataaataaa agagaaaggg ccaccttttt ggagttagtc atggtagtca ttagtgatat 1200
ttttgaaccg tttttaattt gaaatacttc aaaggaagta aaggtcatgg cttagctcaa 1260
aaaaatgctc cagaagttgg gctgcttaaa tcatcagtac aataatatac tgtgtgtgcg 1320
tgcatgcgtg tgtgtgtgtg tgtgtgtgca tgcgtgtgtg tgtgtgtgtg tgtgtgtgtg 1380
tgtgtgtgtg tatgaaatga gaattgccac ataattggag catttgcatt catccgcaaa 1440
tcatgttgag acaaactgta gctggccgtc ttgattaaag ccaagtggtc cctgtggctg 1500
tgaggaagag tttttcttgc aaaaactttg gcaagtgatg acttcgaaac ttacaaaggc 1560
tattgccttt ttttttttta acaccagtag tgaccttcag ttccttcagt ctgagtttaa 1620
gggtagaatt ctaaatttgt attctaatct gtcttttgtg gaaaattttg aaaatagtat 1680
gtatatgtaa tattgtatat gcaaattgtg ttgttttact tgttttgcat atgaccagca 1740
ctgactgaaa ggcatgttta actataaaca ctgttgcttt ctttgtgaaa tgaaaataaa 1800
agtatttaaa t 1811



<210> 18 

<211> 2438 

<212> DNA 

<213> Mus musculus 

<400> 18 

gcaccgtttg gtgtgtgtgt gtgtgtgtat gtgtgtgtac atgcgcgcat gtgcatttgt 60 

gtgtgtgtgt gtatgtgtgt gtatttgtgc ttacacatgt gcacgtgaaa gtatgtgtat 120 

atatgaatgt aagtgtgcat gtgtatgtgt gtgcatgtgt gtaagtatgt gtgttaattg 180 

tgttttttgt gagcaagtaa ataaataaat agttctttct taaagcagca aaagaaaagc 240 

agcaagtcat gtaaaaataa gtcccatcag aataacagca gatttatcag tagttatctt 300 

acagacaagc ggaggatggc attatatact caaagctgta aaaaataatt tttcaatcaa 360 

aatataccta gaaaagctat cccccaagga tgaaagagaa gtgaaaactt tcctaaagaa 420 
aacaaaaact caagacattt gtcagcattt tatcagctct acaagagatg cccaagggag 480
cagtttatag aaattgaagg atgactacca atatacaaat gcacaaaact acaaaaaaat 540
gtgcaaaata gtgctaaaat atgcaggtga atgaggaatc taactctatt tgtacaaaga 600
aatcaccaag gcacaaaatg aacaagaaag gtgggaacaa aaaaaggata cacaaaacca 660
gaaaaaaatg acagtactga ctattcattg ttaataatgt taaatattgc actggcagtt 720
cttctctttt ttttttattt tattagatag tttctccaat tacatttcaa atgttatccc 780
ctttcctggt ttcccctctg aaaatccctt tcccttaact ccttccctct ccccctgctc 840
accaacccac ccactccaac ttcctgtccc tagcattccc ctacactgga gcatggagcc 900
ttcacaggat gaagggcctc tcctcccatt gataaccaac taggccatcc tctgctgcat 960
atgtagctgg agccatgagc cccaccatgt gtattctttg gttgttggtt tagtccctgg 1020
gagctctgtg gatagtgctt agttcatatt gttgttcctc ctatggagct gtaaacccct 1080
tcaactcctt tggtcctttc tccagctcct tcaatgggga ccctgtactc agtccaatgg 1140
atggctgtga acatccactt ctgtatttgt caggcactgg caaagcctct caggggaaag 1200
ctatatcagg ctcctgccag caagcacttg ttggcagcta caatagtgtc tgggtttgga 1260
tggatcccca ggtggcacag tctctggatg gtcattcttt cagtctctgc tccccatttt 1320
gtctctgtaa ctccttccat gggtattttg ttcccacttc taggaaggat cgaagtatcc 1380
acactttggt cttccttcct cttgagtttc atgtggtttg tgaattgtat cttgggtatt 1440
tcgagcttct ggctagtatc cacttatcag taagtgtgta tcatgtgtgt tcttttgtga 1500
ttgggttacc tcactcagga tgatatcctc cagatcaatc catttgccta ggaatttcat 1560
aaattcattg tttttaataa ctaagtggta ctccattgtg taaatgtacc atattttttt 1620
tttatccatt cctctattga ggggcatctg ggttctttcc agcttctggc tattataaat 1680
aaggctgcta taaacatagt ggagcatgtg tccttattac ctgttggaga atcttttgga 1740
tatatgccca ggaatggtat ggctgggtcc tcaggtagta ctatgtccaa tcttctgagg 1800
aaccgccaga ctgatttcca aagtggttgt accagcttgc aaccccacca gcaatggagg 1860
agtgttcctc tttctccaca tcctcaccag catctgctgt cattggagtt ttttatctta 1920
gccattctga ctggtgtgag gtggaatctc agggttgttt tgatttgcat ttccctggtg 1980
actaaggatg ttaaacactt tttaggtgct tctcagtcat tcagtattcc ttagttgaga 2040






WO 2005/005597 



PCT/US2003/027106 



gttctttgtt tagctctgta cccccatttt ttaatatggt tttttgtttt tctggagtct 2100
aacttcttga ggtctttgta tatattggat attagccctc tattggattt aggattggta 2160
aagagatatg agcacagggg gaaaattcct gaccagagca ccaatggctt gtgctgtaag 2220
atcaagaatt gacaaatggg acctcataaa attgcaaacc ttctgtaagg caaaggacac 2280
tgtcaataat accaaaaggc agttcttatt tctttttctg gttttttttt gaggcagcag 2340
agggagaaga gtgtcagcga gggtaatttt tggtcttagg agatatttag ggttgctgta 2400
taaagcatct tcttgtatta agtctaagtc gatttagc 2438



<210> 19 

<211> 1712 

<212> DNA 

<213> Mus musculus 

<400> 19 





a ci t* 1~ t* c i~ r* n t~ 


c yg3 ac g9 a -g 


caggaggcac gcggagtgag 


gccacgcatg 


60 


aaccaaaact* 






aagagtctac 


atgtctaggg 


tctagacatg 


120 


ttcagctttg tggacctccg gctcctgctc ctcttagggg ccactgccct cctgacgcat 180
ggccaagaag acatccctga agtcagctgc atacacaatg gcctaagggt ccccaatggt 240
gagacgtgga aacccgaggt atgcttgatc tgtatctgcc acaatggcac ggctgtgtgc 300
gatgacgtgc aatgcaatga agaactggac tgtcccaacc cccaaagacg ggagggcgag 360
tgctgtgctt tctgcccgga agaatacgta tcaccaaact cagaagatgt aggagtcgag 420
ggacccaagg gagaccctgg cccccaaggc ccaaggggac ccgttggccc ccctggtgaa 480
cctggcgagc ctggcggttc aggtccaatg ggtccccgag gtccccctgg ccctcctggc 540
aagaatggag atgatgggga agctgggcaa gcccggccgt cctggtcccc ctgggccccc 600
cggaccccct ggccttggag gaaactttgc ttcccagatg tcctatggct atgatgaaaa 660
atcagctgga gtttccgtgc ctggccccat gggtccttct ggtcctcgtg gtctccctgg 720
cccccctggt gcacctggtc cacaaggttt ccaaggcccc cctggtgaac caggaacacc 780
aggaggagga ggagaagaaa taatgagtga ttgtgtctcc gtttatggaa agagtttgat 840
ggggtactaa tgttggtaaa tataatattc aaacatgaaa ttctaataaa ataagtgaaa 900
gatatcagaa agccttcaaa atcctgcaaa cacaatatac agaatatata ttaaatttaa 960
ttgacaacta taccttctta gatctattgt tcatttgata attatttcaa attttttctt 1020
ctccatttaa tagtatggct cttaagagac ctgcagtgta tgttaagact atttggattt 1080
gtgctttcat aaggcttaaa atgtttatgt attttttttt aattttctat ttgtatcttt 1140
gtatagcatc tttgaagaaa tgtgtactca aagaattact catgtagaat ctcactgtgc 1200
tgccatctgg gacatcaagt tcatttgtga acccatgctt cctctcagca tcataagaag 1260
atcactacag aagctgatca ccacacttaa ctagtggtat tgtaaaatct cttcattaat 1320
gactttcatt tttgcttggc ttgtgtataa aattgaattg aaagtttata ttatctcatt 1380
tgctaacagg tttgcttaaa accctcttag aaataacatt tttattactt gtcatgtttt 1440
tttttattta cagcaagata tttaaacact gaatatattt tgtagtattt aatttttcat 1500
tatttaaaac aatgtttaaa tcaaatattc aattatatta tgccattcct ccaagttctt 1560
catattgtct cccattttct acccaacttt aagttctttg tcaataaaga tacaaaaatc 1620
caattgaaaa caaacccctc aaaatcagga aaacacattc aaaacaaaca atcagaaaac 1680
cccaaacaaa caataaagaa aacaacaaaa at 1712



<210> 20 

<211> 3651 

<212> DNA 

<213> Mus musculus

<400> 20 

aaaagactgg ctagttgaga atgcaccagg ggatgaggtt ccagttggcc attctttaac 60
aggttttgcc actgcctttg actttttata taatctatta ggtaatcagc gtaaacaaaa 120
atacctagaa aaaatttgga ttgttactga ggaaatgtat gaatattcca agattcgatc 180
atggggcaaa caacttcttc ataaccatca agctacaaat atgatagctt tactcatagg 240
ggccttggtt actggagtag ataaaggatc taaagcaaac atatggaaac aagttgttgt 300
tgatgtgatg gaaaagacta tgtttctctt gaagcatatt gtagatggct cattggatga 360
aggtgtggcc tatggaagct atacctcaaa atcagttaca cagtatgttt ttttggcaca 420
acgccatttt aacatcaaca actttgataa taactggcta aaaatgcatt tttggtttta 480
ttatgctaca cttttgccag gctatcaaag aactgtaggc atagcagatt ccaattataa 540
ttggttttat ggtccagaga gccagctagt tttcttggat aagttcattt tacagaatgg 600
agctggaaat tggttagctc agcaaattag aaagcatcga cctaaggatg gaccaatggt 660
tccttccact gctcagcggt ggagtactct tcatactgaa tacatctggt atgatccaac 720
actcacccca cagcctcctg ttgattttgg cactgcaaaa atgcacacat ttcctaactg 780
gggtgtcgtg acttatgggg gtgggctgcc aaacacccag accaatacct ttgtgtcttt 840
taaatctggg aaactgggag gacgagctgt gtatgacata gttcactttc agccatattc 900
ctggattgat ggatggagaa gctttaaccc aggacatgaa catccagatc aaaattcatt 960
tactttcgct cctaatgggc aggtattcgt ttctgaggct ctttatggac caaaattgag 1020
gccaccttaa caacgtattg gtgtttgccc catcaccatc aagtcaatgt aatcagccct 1080
gggaaggtca actgggagaa tgtgcacagt ggctcaagtg gactggggaa gaggttggtg 1140
atgcagctgg ggaagttatt actgctgctc aacatggtga taggatgttt gtgagtgggg 1200
aagcagtgtc tgcttattct tctgccatga gactgaaaag tgtctatcgt gctttacttc 1260
ttttaaattc acaaactctg cttgttgtcg atcatattga aaggcaagaa acttccccaa 1320
taaattctgt cagtgccttc tttcataatt tggatattga ttttaaatac atcccataca 1380
agttfcatgaa tagatataat ggtgccatga tggatgtgtg ggatgcacac tataaaatgt 1440
tttggtttga tcaccatggc aacagtcctg tggctaatat acaggaagca gaacaggctg 1500
ctgaatttaa gaaacggtgg acacagtttg ttaatgttac atttcatatg gaatccacaa 1560
tcacaagaat tgcttatgta ttttatgggc catatgtcaa tgtttccagc tgcagattta 1620
ttgatagttc cagttgtgga cttcagattt ctttacatgt caacagtact gaacatagtg 1680
tgtctgttgt aactgactat caaaacctta aaagcagatt cagttacctg ggatttggtg 1740
gttttgccag tgtggctaat caaggacaga taaccagatt tggtttgggt actcaagaaa 1800
tagtaaaccc tgtaagacat gataaagtta atttcccctt tgggtttaaa tttaatatag 1860
cagttggatt cattttgtgt attagtttgg ttattttaac ttttcaatgg cggttttacc 1920
tttcctttag aaagctaatg cgctgtgtat taatacttgt tattgccttg tggtttattg 1980
agcttctgga tgtatggagt acatgcactc agcccatctg tgcaaaatgg acaaggagct 2040
gaagctaagg caaatgagaa ggtcatgatt tctgaagggc atcatgtgga tcttcctaat 2100
gttattatta cctcactccc tggttcagga gctgaaattc tcaaacagct ttttttcaac 2160
agcagtgatt ttctctacat cagaattcct acagcctaca tggatatccc tgaaactgaa 2220
tttgaaattg actcatttgt agatgcttgt gagtggaaag tatcagatat ccgcagtggg 2280
cactttcatc ttcttcgagg gtggctgcag tctttggtcc aggatacaaa acttcacttg 2340
caaaacatcc atctacatga aaccagtagg agtaaactgg cccaatattt tacaactaat 2400
aaggacaaaa agcgaaaatt aaaaagaagg gagtctttgc aagttcaaag aagtagaata 2460
aaaggaccat ttgatagaga tgctgaatat attagggctt taagaagaca ccttgtttat 2520
tacccaagtg cacgtcctgt gctcagctta agtagtggta gctggacatt gaagcttcat 2580
ttttttcagg aagttttagg aacttcaatg cgggcattgt acatagtaag agaccctcga 2640
gcttggatct attcagtgct atatggtagt aaaccaagtc tttattcttt gaagaatgta 2700
ccagagcact tagcaaaatt gtttaaaata gaggaaggta aaagcaaatg taattcgaat 2760
tctggctatg cttttgagta tgaatcactg aagaaagaat tagaaatatc ccaatcaaat 2820
gctatctcct tattatctca tttgtgggta gcaaacactg cagcagcctt gagaataaat 2880
acagatttgc tgcctaccaa ttaccatctg gtcaagtttg aagatattgt tcattttcct 2940
cagaagacta ctgaaaggat ttttgctttc cttggcattc ctttgtctcc tgctagttta 3000
aaccaaatgc tatttgccac ttccacaaac cttttttatc ttccatatga gggggaaata 3060
tcaccatcta atactaatat ttggaaaaca aacttgccta gagatgaaat taaactaatt 3120
gaaaacattt gctggacact gatggatcat ctaggatatc caaagtttat ggactaaatg 3180
ctgcaggtcg gcaaaatttg cactaatgtg tcccaaccta ctttgtggat atgaactaga 3240
aaactttgtt tattcttgta catgtatgta tgtgtgtaga gtgagtgcgt gtgtccagta 3300
tgttatttgc acagagatat tttcaaaata ggcaccatat ttggcctagc aggatttatt 3360
tttatgttac cacttttctt gcctttgttt ctgaattttt ttctgctaaa atgtttctgc 3420
tacagaggta tatattctgg ggttctgaaa tatggggttt taatggactt taactcaact 3480
tctttggaaa ctatttatct atcttaggac ctcaaacact acaaacggcc ttgcaattgc 3540
tgctgtatct agtcatctct cgcctcttaa tatggactac aaaactttat gttttgaaaa 3600
cgtctaacat ttaccttgca cacaaaaacg agaaataaaa aaacaaaaat 3651



<210> 21 

<211> 2205 

<212> DNA 

<213> Mus musculus 

<400> 21

tcatcctctt aaataatctt accaaggaat aatcaggaac agtcacgctt ctgtgtccct



60 

ttctgttttg


ctaaagctaa 


gcttatgtac 


caagttagaa 


catcaatgac 


attaaatgtg 


120 


gagacctttg 


ctactttttg 


ttagagggca 


tccctatatt 


tgcttgcttt 


tattttaaac 


180 


cgtggaaatc 


tgaatccaga 


tagaaacaaa 


attagggttt 


tttccatacc 


acatgctagc 


240 


atttgcactg 


attttcatag 


gaaaaaaaac 


ttcaaacaga 


acaaaataaa 


aacatgtagc 


300 


ctatagccat 


tcctatttaa 


aatgattggt 


ttccttggct 


aggtaaaatt 


ctgcatgatt 


360 


aaattgccaa 


taattctgac 


atttgggttt 


ttgcatagat 


tttccaaaat 


ttaggtccta 


420 


agttgttatg 


gtaacttttt 


tttaaagaaa 


gtttaatttt 


aaattacaga 


tggattttgc 


480 


tgggcattag 


caatttgtgt 


ttatttagaa 


aatagagtgc 


tcttattttt 


gtaaatgtct 


540 


cacggaaata 


actaaatttg 


tttataaatt 


gagactacta 


aagcacaatc 


gttgaagcca 


600 


tagagaacat 


cttgaaatac 


agttttaagt 


ggagaatttt 


aggaaactta 


cataatatca 


660 


taactcaaat 


atatttaaat 


tgcaattctc 


tcagccttta 


tactcatgtg 


ctgtatacac 


720 


agttactcta 


aacaatgtaa 


gagacatata 


cagtagcccc 


tagagttatg 


aatttttaag 


780 


tcaattaatt 


tccatgaaga 


aaattgagaa 


tggctgttta 


tgtctgtata 


tggtgtgatc 


840 


ctctagttgg 


tgcatgcatg 


tgtgcatgca 


tgtgtgtatg 


tgtgcatgta 


cacgtgcgtg 


900 


tgtatgtttg 


tgtgtgtgtg 


tgtgttgagt 


tttctcccaa 


tccttgtaat 


ataggaaatg 


960 


aacacttatc 


caaatgttga 


gagttcattc 


acaccgcatc 


tgagttacta 


ggtcctggga 


1020 


cagtggatat 


aggtattttt 


ctcttttgtg 


gccaatttat 


ttaaatataa 


aacaatggtg 


1080 


ttagtcttag 


taggagttta 


gagtgacaaa 


gacttaaaat 


ttcctttgag 


agtggtattg 


1140 


ctcatgccta 


gcatctttgt 


gtatgtgcag 


aaaaggagag 


tagtgttagg 


ggctgctgag 


1200 


actatgggga 


gaaatgatga 


tacattgaag 


agctaggtct 


agggagagaa 


atcaaaatac 


1260 


tcttgaaaga 


taggaaaaca 


ttgacatagg 


gctacctcat 


atttttttta 


tttatttgca 


1320 


tgaaataaaa 


ctagaattat 


aaaattcaca 


ttctcaattg 


ggatattata 


tttgttaatc 


1380 


cataaaacta 


tttacattgt 


atgtggcaag 


ttgtagtcat 


ttttaagagt 


tagactctta 


1440 


ttgcttccaa 


ccaagaaaaa 


taaatgaatt 


cagtctagaa 


ttggcaagag 


taatgaagta 


1500 


ataattgtaa 


aaattgttgg 


agtatgtctt 


cctgagaatc 


atagtctcct 


gtatagcttg 


1560 


actggccttg 


aagaacagtg 


gtgacaacag 


agcctgatgt 


ctgttgcacc 


tttgagtctg 


1620 


tcattatttt 


atagacaaga 


tctgaagctc 


tgataaccct 


ggctaaaaac 


atttttaaga 


1680 
aaagaaagtt
gttatattta 
ccttttcttt 
tttttcttcc 
aatatgaaac 
tgatgcttag 
gcatgctatg 
tgagcttctc 
ttaaagctgc 



actttaatat 
ttcaatggaa 
aaatatatta 
cctataaaat 
tgtgagtctg 
ggcttcttgc 
gtttgactaa 
tcagtttgaa 
agactttaca 



tatatattat 
aacactgtat 
aatgaaatta
gagtttctgt 
tgatctgaca 
acctaaagta 
tgtcgaatgg 
gaaggcaagt 
ttagggaaca 



tgttttactc 
ttgtattgga 
aaaaacatat 
ggttcagaga 
gatcaactaa 
ggaattatta 
ttgctgtgca 
gaagtaccaa 
cactcagata 



atcaatagtt 
tacaaacatg 
ggggctttcc 
caacctgaca 
aatatgaaac 
gaaccagttc 
tcagaaaagg 
gcaggtatta 
aaact 



attgcttatg 

gatttgatat 

ttggggttga 

gatcaactaa 

tgtgagtctg 

tttgttcaat 

aagatattat 

gcagagttat 



1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2205 



<210> 22 

<211> 4059 

<212> DNA 

<213> Mus musculus 

<400> 22 

aaatatatat ggaagtgatt ttcttaaacc 
aattcttatg taacagatta actccaactg 
gattggcatg tctgttgtat ttcttgacag 
tttccaggat ggtgctttca tttagtgcag 
tcccagacca aagctggtag gcaaccttat 
ggtgttacct gctgggtaag cactaaagag 
ccctaactgt gcttcttcac cctgcctcct 
cttccttctg ctctgccggt gaccacagaa 
atagtgaaac agcctcacag gcaggttatg 
cacaccatgg ctttaggtat ggaagaacat 
gctactgcag tagcagtata actaaatgaa 
ctgtccagtg cttcttgctc agctggggct 
accaataagc tctcaccttt tccacttggt 
aatttcttcc ttcttgtctc aagaagtctt 



cttaaaagac 
ctagaaggta 
cttgaaatga 
tctactctag 
actttgccac 
agagccgtca 
gtgctttctc 
gtctggaaag 
tttcgttctt 
gtctaacaac 
tgactaaaat 
ctctcctctc 
agtctaaagc 
taatctattt 



caatcaaatg 
acatgaaaaa 
tgtgtgttga 
ctggttgctc 
tatggcaacc 
ggccgctctg 
tgatcaacca 
caagtgatga 
ctgaactcag 
tccctggacc 
gcccagcttg 
actccatagg 
ttagcttgat 
ctatcattct 



tggtgttact 
tgcccagctt 
gaaatgaatc 
tgaattttag 
ccgcattgct 
cccattccat 
tcgacacccc 
ccatgagcat 
tccactaaca 
agggtgtttt 
atttaactgc 
gagaagaaaa 
ccccatatgt 
ttaaatcctg 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
actagtctga


gttatttctg 


tctttacaaa 


gtttcactac 


ttgttgctta 


aaggataaaa 


900 


atctctgctc 


atcagcatag 


cagcaataat 


tcctcatgcc 


tggccttttg 


atttctttgt 


960 


cccatgatga 


tcttcagcca 


gttccagatg 


gtcatctcct 


ctctgtagaa 


ggaacccaca 


1020 


tgttcaaggt 


cctactcaga 


aaaggctctt 


cagttgccaa 


gatgcactgg 


atcctacaca 


1080 


ttatagttat 


tcctgtgtct 


ttgttcatcc 


attaatgtgg 


tgcttataat 


agtgggtatt 


1140 


agtatggtca 


tcatcattga 


cttcctagct 


gtccaagtcg 


caagaagaga 


ttgtctgcat 


1200 


gagtttgtat 


ttcctgagtt 


gttccaaact 


ttttggaata 


cattttttct 


taaacacaaa 


1260 


tagtctcatc 


cttccacaaa 


ggctctgagc 


tctactcata 


gtttgtctct 


taaccccaag 


1320 


ctctctgggt 


tggtcaagca 


gtcacatgtg 


ccatcctact 


agccatggaa 


tgaatgactt 


1380 


tgatatcaac 


tcaaaaatct 


atgaccagtg 


aaagccaaag 


acacttggtt 


aggagactct 


1440 


gatttaacct 


ttcagggaaa 


ctgaatgtaa 


gaaagaaaga 


ctatgaccca 


gctccggtat 


1500 


ctagtgaacg 


gagagacaga 


aaagctgccc 


ttggttcttg 


tgggtccccg 


gaatgcaccc 


1560 


cagtagcctt 


ctgctggacc 


cttttgtatg 


taggtcaaag 


gattggcttg 


gttctttgca 


1620 


acagcaagga 


ttctaacgtc 


tgtaggtatt 


ttcctttgag 


gcttgagatc 


tattgggtgt 


1680 


agaagctcag 


tccacaccct 


gttcaaaatg 


gtgtaagact 


tctaaattcc 


cttgagtcaa 


1740 


ggcagaactg 


tggccttaca 


agaataaagc 


ctatcttcgg 


tgcctaccta 


atatccctaa 


1800 


ctaggctctt 


tgaccttggc 


cctttgtcct 


ataactccct 


cacttagaat 


gctttctctt 


1860 


ctcttcttac 


ctaactacaa 


cttgtacctc 


agaacctacc 


tcaggtgacg 


catcatcagg 


1920 


aagaccgctc 


ctcataccct 


aagtcagccc 


atgtcacacc 


actgtcctac 


ttgtctttgt 


1980 


tctctgctac 


atggtaactc 


ctcaagggcc 


aaaaatggtc 


tcattttagc 


tttgcacttc 


2040 


tagagccaaa 


cagtccttgc 


cccattacag 


gcttgctaaa 


tgtttataag 


tgaatggatg 


2100 


gggccatcta 


actcagtagt 


cattttcctc 


ttgatggcga 


cactgttcag 


agtgacatct 


2160 


tgtccagcat 


catgtagcca 


ttttaccttt 


agtttactaa 


agagaaagtt 


ttaaagggaa 


2220 


taatatcttt 


agaggagaaa 


attaatgctt 


tattttttca 


tttaagtgaa 


tacatatata 


2280 


gtaaaactga 


acatatgtaa 


agtagaattc 


tctttaagtg 


taccgtttgg 


tataaatgga 


2340 


tgtacccatg 


tcacctcttt 


taataattga 


ggtagccgaa 


tgttcccctc 


acctaagaca 


2400 


gtagatccct 


atatgccctt 


cctattcagt 


ctcctgcaca 


ctcacagatg 


gggactccaa 


2460 
accgtcacag aactgtggct agccctaaag actagccttt cttctctaga ggttcaactc
tatactcttg tgtctggctt gctctacttg gcataatgtg ctgggattca tactgtggca 
ttggtagttc actgtcgatt gtggctgaat acatatcttc atagaggcat gcatcttcct 
tcccctgagg caaatgccaa gtataaccga tactgtgagt catatatccg tacttgttta 
ccttcctaag aagtcttcag gttcttgtgt gaagtgattc caccatttac actcccacca 
acagtggagg agttgcagtg gttccccgtc ctggttgtca atgtattaat taagttaata 
ggtgggcatg ttttattttt tctgttattt cccccgttag atagaacagc accttattat 
gcttggtggc tgtttataca taagttatta ttttatagtc tactctttgg tccatttaaa 
tcagttacct tcttattaca ttttaagact tatttattgt ggatatagca tagaaatccc 
attatgcatt ataccttatt ttttattaac tatacccttt ttctttttaa tttttattta 
aaaatttacc cctcactcat atttacataa aatttattat tcccccccgc ccccgcacaa 
cactaattac ctttttcaac agcaaaagct tttatttggt ggaaaaaaaa ttgggcattt 
ttttttttct tttaaagggt ttttgtgtac tatctataaa atccttgtgt aactcagtta 
cacatacttt ccccatgttt tcctctataa attttataaa tttagttcat tcatttagag 
ctatgtttct ttttctttag ttcctttaag gtatgtgtga ggttaagggc taaggttcat 
gttttcactc tggtgcccac ctttgctaag caggccacct ttccatgctg catcacgcct 
ctataaagtc aggaggaggg actcgctggc atctgacttg acttcatttc ctgtggattt 
gtgtctgttc tcactgcaat agtatagcat cttgattttt gtggatttag agcgagtctt 
gggattgaca gtacaactgt tcacttaggt caaaattatt taagctatgt cagtattatt 
gcttataact tttagaatca gcattttggt tttttcacaa aatagtttgt taggattttt 
ttttaataga cagtattagc cctataatag gcttagggac aattgatatc atagtattga 
gctatccaat ccatgaacac agtatatctt tttatttact tagctctttt aaattttcac
tcaataatat attttagttt tgagtgggca gttcttgcat tcattttgtt ttaaaatatc 
cctagatatt tcacatttct gttattgtaa attatatttt taactttaaa tttccagtgt 
cttgttaata tagacaaata caactaaacc ttttatattg accctgtatc ctaaaatctt 
gaaaaactta atacttctgt tatctttttt ggtagcttcc ctaggatttt ctacatatat 
aatataccaa ctgcaaataa agattatatt gcttcttcc 



2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 



3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4059 
<210> 23

<211> 1496 

<212> DNA 

<213> Mus musculus 








<400> 23 
gattctgagc 


aaacacggac tgccacaacg 


gaggtcctag ccaccttctg atattgactg 


60 


tgaccactgg 


atacagaaat ggctaacaat 


ttcactaccc 


cactggcaac gtctcatggc 


120 


aataactgtg 


atctctatgc ccaccacagc 


acagccaggg 


tattaatgcc tctgcattac 


180 


agcctggtct 


tcatcattgg gctggtggga 


aacctgctgg 


ccttggttgt cattgttcaa 


240 


aacagaaaaa 


aaatcaactc aaccactctc 


tattcaatga 


acttggtcat ttctgacatc 


300 


ctgtttacca 


cagctttacc cactcggata 


gcctactatg 


cgctgggctt tgattggagg 


360 


ataggtgatg 


ccctgtgccg ggtaactgct 


ctggtgttct 


acatcaacac gtacgcaggt 


420 


gtgaacttca 


tgacttgctt gagcatagac 


cgcttcttcg ctgtggtgca cctctgcgct 


480 


acaacaagat 


taaaagaatc gaatacgcaa 


agggtgtctg 


cctgtccgtc tggattctgg 


540 


tctttgctca 


aacactgccg ctgctcctca 


cccctatgtc 


taaggaggag ggagacaaga 


600 


ccacttgcat 


ggagtatcca aactttgaag 


ggacagcgtc 


cctgccgtgg attctgctcg 


660 


gagcctgtct 


gctgggctac gtgctgccta 


tcacagtcat 


tctcctgtgt tactctcaga 


720 


tctgctgcaa 


actcttcagg actgccaagc 


agaacccact 


caccgagaaa tctggtgtga 


780 


acaaaaaggc 


tctcaacaca attatcctca 


tcattgtcgt 


gttcatcctg tgcttcacgc 


840 


cctaccacgt 


ggccatcatt cagcacatga 


taaagatgct 


ctgctcccct ggagccctgg 


900 


agtgtggggc 


gagacattcc ttccagatct 


ctctgcactt 


cacggtgtgc ctgatgaact 


960 


tcaactgctg 


catggacccg ttcatatact 


tctttgcatg 


caaagggtat aagagaaagg 


1020 


tcatgaagat 


gctcaaacgt caagtgagtg 


tgtcgatctc 


cagcgcagtg aggtcagccc 


1080 


ctgaagagaa 


ttcgcgggaa atgacagagt 


ctcagatgat 


gatccactcc aaggcctcca 


1140 


atggaaggta 


aaggcacttg ggacttcaca 


gcacagcaag 


ctgcgggatg ggccccgccc 


1200 


accgactggt 


cggctcccaa caaagatgcc 


ttccactgcc 


gccccaccgg ccaatgcact 


1260 


gagatccaga 


ccagatcgag gagacaaaaa 


agcaagttca 


acttcataaa tgaaatataa 


1320 


tgtatataaa 


ggaaggctct cataagtctc 


aatgtaaaaa 


gaaattcttt gtgaaattac 


1380 


tatttcttgt 


caatagtttg gcaaaagacg 


actaattgca 


ctgtatattg ccagtgtaaa 


1440 
aatgttaata ctgtaatata tgaatatatt tcttaattta cacctctttc aatttc



1496 



<210> 24 

<211> 1341 

<212> DNA 

<213> Mus musculus 

<400> 24



ggtggcagct tttctacaat gaaggctgga aagaccttgt gagaaatgag agacaagtga 60 
gattcctctg ctgcatgttt gcccacatgt ggttcctagc tcggtggcgg aggggcactg 120 
ggtcgtcatg tcaaaacggg ccagcttgct gccattaagg ttatggatgt cacaggggat 180
gaagaggaag aaatcaaaca agaaattaac atgttgaaga aatattctca tcacaggaac 240
attgctacat actacggtgc ttttatcaaa aagaaccctc ctggcatgga tgaccaactc 300
tggttggtta tggagttctg tggtgctggc tctgtcactg acctgatcaa gaacacgaaa 360
ggcaacacat tgaaagagga gtggattgca tacatctgca gggagatctt acggggcctg 420
agtcacctgc accagcacaa agtgattcat cgagatatca aagggcagaa cgtcttgttg 480
actgaaaatg cagaggttaa gctagtggat tttggagtga gtgcccagct tgaccgaact 540
gtgggcagga ggaacacgtt catcgggact ccctactgga tggcaccaga agtcattgcc 600
tgtgatgaga acccggatgc cacatatgat ttcaagagtg acttgtggtc tttgggaatc 660
accgccatag agatggcaga aggtgccccc cccctctgtg acatgcatcc catgagagcc 720
ctcttcctca tcccacggaa ccctgcacct cggctcaagt ctaagaagtg gtcaaaaaaa 780
ttccagtcat ttatcgagag ctgcttggta aagaatcaca gccagcggcc agccacggag 840
cagttgatga agcacccatt catacgagac caacctaatg agaggcaggt ccgcatccag 900
ctgaaggacc acattgatcg aacaaagaag aagcgaggag aaaaagatga gactgagtat 960
gaatacagcg gaagtgagga agaagaggaa gagaatgact ctggggaacc cagctccatt 1020
ttgaacctac caggggagtc aacactgcga agggacttcc tgagactgca gctggccaac 1080
aaggagcgct cagaggccct gcggcgccaa cagctggagc agcagcagcg ggagaatgaa 1140
gaacacaagc ggcagctact ggctgagcgc cagaagcgca tcgaagagca gaaggagcaa 1200
aggcggaggc tggaggagca acaaaggcga gaaaaagagc ttcggaaaca gcaggagcgg 1260
gaacagcgcc ggcactacga agaacagatg cgtcgggagg aggagaggag gcgtgccgaa 1320
catgagcagg aatataagcg c 1341

<210> 25 

<211> 2368 

<212> DNA 

<213> Mus musculus

<400> 25 



tcttactatt 


aagtatgggg 


atgattctgg 


tttttacaaa 


tgtgttagtc 


agattaaaga 


60 


aattgccttc 


tgttcctagg 


atttagagtg 


tttttattgt 


gaagcagttt 


tgaattttac 


120


cagaggcttt 


tttaaaacct 


gattttgtgt 


atgcattcat 


gcccatgtct gtctgtctgt 


180 


ctgtctgtct 


gtctgtctct 


ccctctccct 


ctctctctct 


ctctctctct 


ctctctctct 


240 


ctcacttgta 


tatgtgtgtg 


catgtgtttt 


cttttttatt 


gatatggctt 


attgcattga 


300


ttgatttttc 


agttatttta 


tttatttgtt 


attttatgtg 


aatgaatatt 


ttgcatgcat 


360 


gtatgtctgt 


gcaccacctg 


tgttactggt 


ccctgaggag 


gccaaaagaa 


gtagtcaggt 


420 


ccctcacact 


agaattgcag 


aaggctgtga 


accgccatcc 


tgagcttagg 


gtttaaatga 


480 


ggtcttctgg 


aaaagcagct 


gaaccacctc 


tccagccctt 


gatttttagg tttgttttta 


540 


attattattt 


atttttattg gctattttat 


ttattttaat 


ttcaaatgtt 


atcccccttc 


600 


caagtttccc 


ctctaaaaac 


ctccctccac 


acccgcccta 


cctctttgag ggtgctccca 


660 


cacctaccca 


cccactccta 


cctcagtgcc 


ccagcatttc 


cccaccctag attcagctct 


720 


taaactagct 


ttgcattcct 


ctaataaatt 


ccatttcaac 


atgccttaat 


tatcttcatg 


780 


tgttgatgaa 


tttgttttgt 


taattttttg 


gtggtaattt 


ttctatgttg 


catataggtt 


840 


aaattgttct 


atcgtttctt 


tttcttaatg 


actttttttt 


tggtttgact 


ctgatagcac 


900 


agtgatatta 


gcttcataaa 


atgtgttgat 


gtatccccat 


ttctaatatt 


tggagcagct 


960 


tctgagcatt 


tttcaatttt 


tctttaaaca 


tttttgaaat 


ataacgatca 


aagatatttg 


1020 



gtccaggatt 


ttctttgttg 


atagttttat 


tattattatt 


attattatta 


ttattattat 


1080 


agattcaggc 


tctttcctaa 


tgctttatta 


cctatttttt 


cctgatttag 


tttcagtgga 


1140 


aactagctta 


tttcaatatt 


tttatagtat 


ttattaagcc 


ttctcatctt 


tcatctggtt 


1200 


ctttctataa 


tgtgtcattt 


tctccatctt 


catttaaatc 


tttctatagc 


ttcaaacctc 


1260 


aaatgcttct 


tattggtatt 


atgactgatt 


atttttatgg acttatctaa 


cttgctgaac 


1320 


tatgtgtaga 


gtagcataag gatccagtta 


ctgtatttat 


ttcccgtacc 


ttttgtgtgt 


1380 
atgtaggtgt ttgtctgaac gtgtggtctt tgtaccacct tgtgcagtgt ttacagtcac
ctatgtatgg ttaaggttct cggggactag agttacagat agttgtgagc cactgcatgg 
ctgctagaaa ccaacttcgc tcctctggag gagcagccac tgattcaggc cctaaccagt 
atttctgttt tgttgtttta ttgccactta aaagtttgcc ccacttgtga tcttacattt 
ttcaatttaa aaaattacag cacagatagc tttcaattta taatcagttt gaaagtgttt 
caaatattat ttcataaaac attttatgat actgggtatc acaatcttac ccaaattaaa 
caaaatatgt ctttaggaaa agaagaaatt acaggccaat ttatcctaaa aaaaaaaaaa 
aaacaagaac agaacacttt ttaataaaat actgacaaaa ccaaaatcat cactatatgt 
aattaattaa tggcatgcag aagttttttt caaaggaatg gaaggttaga cctgaaagct 
ggtaaaaaga atgcattgca ttctaaattt tactcatcga tattgaataa aattcttaca 
ttttgttatg tttctatgaa gaaaaatctc ttaaaaatag aacgataaca ctaaagctat 
ttaaatgcaa tgttagggtt ttgttaatgt agagaaaaca attttcattt tcaaatgttt 
gggttatatg ataatgtgta gagttcttta gtgatgacat gatttttttt tttcttgagc 
tacaaaaagt cccaaagtag aaattaagtt taaatttttc taaatcctta attaattaaa 
aaaaaagttt catgctgagt gtggtggcca aaggcaggtg aatctctgtg cgtgtagggc 
agcctggtat tcattgcaag taccaggcca tccagtgcta catagtgagc ccctgtcaac 
caatgaagca gagacaaaga aacaaacc 



1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2368 



<210> 26 

<211> 1941 

<212> DNA 

<213> Mus musculus

<400> 26 

aagctagcct tgaactcaga agtctgcctg cctctgcctt tcaagtgctg gaaataaagg 60

tgtgtgccac cactgcctga ccataaaatt actttttatg tagaattttt tttttttttt 120

tttttttttc tgagacaggg tttctctgta tagccctggc tgtcctggaa ctcactctgt 180

agaccaggat ggcctaaaac tcaaaaatcc gcttgcctct gcctcccagg tgctgggatt 240

aaaagggtgc gccaccacca cctgactaga atttttatat atactttgta tatagaaata 300

ctttttatat agatggtcat gttctgtata ggctatctag cagtcttggt tcacaacctt 360

tcctgaattt ttagggcttt cttgttgtga tctaatgagc ttctgtgctt atttcagagt 420

aacaatcttt gagggtagca acttgtaaat acagaatgtt tagcctagat tactgtaaaa 480 

attaaatgtt ggatgaattt tgaataggtt tgaacgaact atttacttat gaaggaaaca 540 

ttagttaatc tttaccactc ctgtttagct ttcactaata aagaaactcc tcaagtcctc 600 

aggtaatttt catgctcact gagtgctcag tacctgattt cactggtggt ctaagacttt 660 

taccctgagg agttatcata tcctcagtta atcaggaaac tgtgtcttaa gtatttgtaa 720 

tgttgatcat ctatttttca atttacccga tcattatcaa taaactgtta atgtacttga 780 

tgataaggtg tgattacttt atttcctata gtgtattctg tatctctgta cttcccagag 840 

tcagatcttc ccaattcatc ggttgtttat taagccctta actgttttta tagtgttagg 900 

ctatttgaat ttagatgatt taggtagttt gccataaagt atgggcatga tatccctgtt 960 

cttttgatcc tcctattgtt gatgtactct gtaagccttt gtctattgtc tatgtttgta 1020 

aataaggctg tattttaaat gtgtagtttt tttttctctc tccagaagga tctgcttttt 1080 

catttagctg aaagtgttta aaaatcatgt ctgtctgtaa agatgacaac agctcccagt 1140 

aacacagaag cctgtattgt gtgagctata acttggaaga atttcagata tacaatgtcg 1200 

tagtgatttt ctataacaat tttttattta aaagggagaa gaaactggct ttgtactctg 1260 

tgaatttcag ctttgtgttg tctatacatt gctcctagtg ccttggtaat gctgactatg 1320 

atgacatttt tgttacagtc ggcgaggctc tggcctgggc aagagaggag cagctgaggc 1380

ccggcggcaa gagaaaatgg cagaccctga aagcaaccag gagacagtaa attcctcagc 1440 

tgcccggaca gatgaagctc cccaaggagc tgcaggtata ctgactggca cttaaaacac 1500 

acatatattt ttgttcgttt cacaaattta tctttggatg aattttcttg ttctacatcc 1560 

taagtaggat gaaaggaggg gagagaatta agaggttacc atgaaacact tttattttag 1620 

gtcatagatt gggtgctctt gatttgtggg ctcatttgtt gttaagactt aaactctcaa 1680 

gcagtccata catgtactac tttcagaggg atttatagta aggtataaat tttccattta 1740 

aggtttttat atattgctta gagttgacta atgattgttt cattgaattt aaattcataa 1800 

taaaaaagta aagatgtatt tgaattgctt tctaagcatg tagatcttag cattttatac 1860 

gccctaaaaa tttgttttgt tctgaagcta ctttagtaat atttagattt ttatgggctt 1920 

atttgatatc ttggattgcc g 1941 
<210> 27

<211> 1940 

<212> DNA 

<213> Mus musculus 

<400> 27



aagctagcct tgaactcaga agtctgcctg cctctgcctt tcaagtgctg gaaataaagg 60
tgtgtgccac cactgcctga ccataaaatt actttttatg tagaattttt tttttttttt 120 
tttttttttc tgagacaggg tttctctgta tagccctggc tgtcctggaa ctcactctgt 180
agaccaggat ggcctaaaac tcaaaaatcc gcttgcctct gcctcccagg tgctgggatt 240
aaaagggtgc gccaccacca cctgactaga atttttatat atactttgta tatagaaata 300
ctttttatat agatggtcat gttctgtata ggctatctag cagtcttggt tcacaacctt 360
tcctgaattt ttagggcttt cttgttgtga tctaatgagc ttctgtgctt atttcagagt 420
aacaatcttt gagggtagca acttgtaaat acagaatgtt tagcctagat tactgtaaaa 480
attaaatgtt ggatgaattt tgaataggtt tgaacgaact atttacttat gaaggaaaca 540
ttagttaatc tttaccactc ctgtttagct ttcactaata aagaaactcc tcaagtcctc 600
aggtaatttt catgctcact gagtgctcag tacctgattt cactggtggt ctaagacttt 660
taccctgagg agttatcata tcctcagtta atcaggaaac tgtgtcttaa gtatttgtaa 720
tgttgatcat ctatttttca atttacccga tcattatcaa taaactgtta atgtacttga 780
tgataaggtg tgattacttt atttcctata gtgtattctg tatctctgta cttcccagag 840
tcagatcttc ccaattcatc ggttgtttat taagccctta actgttttta tagtgttagg 900
ctatttgaat ttagatgatt taggtagttt gccataaagt atgggcatga tatccctgtt 960
cttttgatcc tcctattgtt gatgtactct gtaagccttt gtctattgtc tatgtttgta 1020
aataaggctg tattttaaat gtgtagtttt tttttctctc tccagaagga tctgcttttt 1080
catttagctg aaagtgttta aaaatcatgt ctgtctgtaa agatgacaac agctcccagt 1140
aacacagaag cctgtattgt gtgagctata acttggaaga atttcagata tacaatgtcg 1200
tagtgatttt ctataacaat tttttattta aaagggagaa gaaactggct ttgtactctg 1260
tgaatttcag ctttgtgttg tctatacatt gctcctagtg ccttggtaat gctgactatg 1320
atgacatttt tgttacagtc ggcgaggctc tggcctgggc aagagaggag cagctgaggc 1380
ccggcggcaa gagaaaatgg cagaccctga aagcaaccag gagacagtaa attcctcagc 1440
tgcccggaca


gatgaagctc 


cccaaggagc 


tgcaggtata 


ctgactggca 


cttaaaacac 


1500 


acatacactt 


ttgttcgttt 


cacaaattta 


tctttggatg 


aattttcttg 


ttctacatcc 


1560 


taagtaggat


gaaaggaggg 


gagagaatta 


agaggttacc 


atgaaacact 


tttattttag 


1620 


gtcatagatt 


gggtgctctt 


gatttgtggg 


ctcatttgtt 


gttaagactt 


aaactctcaa 


1680 


gcagtccata 


catgtactac 


tttcagaggg 


atttatagta 


aggtataaat 


tttccattta 


1740 


aggtttttat 


atattgctta 


gagttgacta 


atgattgttt 


cattgaattt 


aaattcataa 


1800 


taaaaaagta 


aagatgtatt 


tgaattgctt 


tctaagcatg 


tagatcttag 


cattttatac 


1860 


gccctaaaaa 


tttgttttgt 


tctgaagcta 


cttagtaata 


tttagatttt 


tatgggctta 


1920 


tttgatatct 


tggattgccg 










1940 



<210> 28 

<211> 2935 

<212> DNA 

<213> Mus musculus

<400> 28 



tgtatctctt 


tcatcaattt 


ttctctgtgg 


tatagcaaat 


gaccacaact 


caggagcttg 


60 


gaatagtatg 


tttattgttt 


gtgtgttttt 


atggatcaac 


tgggtctctt 


tattttttgt 


120 


tgttgttttg 


ctttgtttct 


tgtttttttg 


agacaggatt 


tctctgtgta 


gccctgaact 


180 


ccgtttgtag 


accaaacttt 


ctttgaactc 


agagaagtaa 


gcttgcttct 


tcctctcgag 


240 


tgctgggatc 


aaagacatgt 


gctgctacca 


cccagcttca 


gctgagtagg 


tctcttattc 


300 


aggcatcacc 


aggtggcctt 


acagtgttag 


ccaggctatg 


ttccccttct 


gagcttcaat 


360 


cttcttccaa 


gccctctctg 


tctgctggca 


gattcctcat 


ggtctgaaag 


aatgaagctc 


420 


cattttcttt 


ttcttttttt 


tttttaagat 


tttatttatt 


tattatatgt 


aagtacactg 


480 


tagctgtttt 


cagacactcc 


agaagacgga 


gtcagatctc 


gttacggatg 


gttgtgagcc 


540 


accatgtgct 


tgctgggatt 


tgaactccgg 


acctttggaa 


gagcggtcgg 


gtgctcttac 


600 


ccactgagcc 


atctcaccag 


cccaaagctc 


cattttctag 


ccggttgtta 


acagaccatt 


660 


cctagctact 


gtgttatgcc 


ctcctttggg 


agtgcatgtt 


agctgtttgt 


gtccttctat 


720 


gccagcaaga 


gcagcctctg 


ctttggaagt 


gctcacctat 


gtagaggcag 


aaggatcgag 


780 


gagttcaggg 


tcatcttcat 


ccacatagcc 


cgcttagaca 


acatgagacc 


ttgtgtcaaa 


840 
agaaatgaac aataggagct ggcaagcttg
ctggcagctt gagctcaatt cctaaaattc 
ttatcctcgt tctgcccaat aaataacaaa 
gttttttaaa atctttttaa aagcgctcat 
cactctgtgg gttggtagca ggactctgga 
acttacaaag aaccaaaatc taagccccag 
ttcttttctt ttcttttctt ttcttttctt 
cttttctttt tttttttttt ttttgagaca 
gaactcactc tgtagaccag gctggcctcg 
tgtttagtag ttcatagggg tttgtttgcg 
attttcctcc tgacccggcc tctgtttcat 
cactaatcca gttaggaatc ttctgtctgt 
aggaggtctg atctaagagc agattatttg 
atgcaagggt tgttgtccta gggggcattc 
tttcctctgt gcctctagag ctgaaaactg 
gttctatcat ggagctaaat ccccaaccct 
tgaagcctgt gatagctcac acctaacttc 
atgaggagct agtgtaatca cttagccaaa 
tggacagaga gccttcaggt aaggagtggg 
agctgggttc tggctgaggc tctgcttaag 
attcactaac aatataatat ttccatttat 
agaaatggca acggcagcaa tagagactca 
ttctgatgat gatgcagaga gctgccccat 
gggcacccca gagacctgtg cccattattt 
ggtgagttgg cttttcagtg ggctactgct 
gcaaaactga acacagggct gagtgctgcc 
ttactgtggg ggctgttact gggtgagaaa 



ctaagtaaat 
acatggtaag 
tgcttttttg 
gtggtaggat 
gaagggcagc 
tgaggacctg 
ttcttttctt 
gggtttctct 
aggatctgct 
aactgcagtg 
aatgccgaga 
cttccacaag 
ttgtcactag 
ttagaggcgt 
aacctagggc 
tggtggttta 
ttggatggtc 
tatctctgtc 
gactgtgtga 
ggtagtccct 
ccaaggttct 
agggaagctt 
ttgcctcaat 
ctgcctggat 
accccttatg 
tgggttctca 
ggatgactct 



gaggatgttt gccaccaagc 
agagaaacaa tttccataac 
tttgtttctt tgtttgtttg 
tgatcagcga gggtaagttc 
ctcgcacctt tgttagtcct 
cttttctgtt ttttcttctt 
ttcctttctt tctttctttt 
gtgtatccct ggctatcctg 
tttctgatcc ctgtgatata 
tttgctaaat gcatcttcat 
tcgactagct cttttatagt 
ggaggaagag atgcccaaga 
gttgaaagag ttgtgcagct 
agtcattatt ttgttgagtg 
tttgtacttg ctaagcaagt 
gtcattagca ccatcttctc 
tttttctcag aggctcaggt 
tccgtagctc agggcagctc 
gggagacttg gctcaggagg 
gaacatgcac ttgtttctgt 
gaagactcag aagatggagt 
gaagctagca gtgtacccaa 
gcatttagag accaggctgt 
tgcatcatcg aatggtccag 
actaggctgc cttctctgca 
gcattgtggg gaaggaatgc 
ggcttttctg agggtcagga 



900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
cacaaactct


tcggcaacct 


cacatagtaa 


gtgtaaggca 


gcctctgtag 


gtattggaat 


2520 


ggtgcaagtg tgcttactca 


gttctgatgg 


caagtcatga 


agagacaagc 


ttttatcacg 


2580 


ctgcttacct 


tgcagtgaac 


tttaaggaga 


cttgcctggg 


ccttcaccca 


ctggcaaaga 


2640 


cttttatatg tgtacaggta 


tgtggtgtgt 


gcctgtcttt 


acacgtagag 


gctaaaggat 


2700 


gtcaggtatc 


ctgctctgtc


gctttctgcg 


ttattccttt 


gagtcaggat 


cttttactga 


2760 


agctgcagca 


taataggcag 


ctagtaaacc 


caagacattc 


tctcattctt 


catggcatta 


2820 


gcattgagag atgaaggtat 


aatcttgccc 


aggtttttat 


gtggatgttg 


agatatgagt 


2880 


ttgggtcctt gtgtttgcat 


aacaaatttt 


cttatctgtt 


gagccatccc 


catcc 


2935 


<210> 29 

<211> 4090 

<212> DNA 

<213> Mus musculus 












<400> 29 
tacccaaata 


caagtaccag 


ggtggggagc 


tacaaggcct 


ccctgtgtga 


tctgggacat 


60 


gaaggggcaa 


gggcaagggg 


attcctgaag 


ggaaggggtg 


gcattcaaat 


tctgaaagcg 


120 


tggaagatct 


tattagcaga 


cagaggtctg 


ggccgggcga 


ggcagctggc 


aggggttctg 


180 


ccatcattgc 


tggtccaggt 


aggcagacaa 


cagaagctct 


ggaggcagga 


aatccaggtg 


240 


gcggttgctg 


actttcatct 


gcctccggcc 


ctccaggtca 


gcactggctg 


ttgtccagca 


300 


tagcggtgta 


cacagcttgg 


ggggacaccg 


agaagaacgc 


cagcagccgt 


tcctcccaac 


360 


tctccagggt 


ttccattggg 


atgcggctgc 


gcttgagcac 


ggaccgcagg 


cgacggctga 


420 


tgcgcctgaa 


gcactcgtgg 


ggtgtgaaga 


cagggtcctc 


tctccgtcgg 


tcctccccac 


480 


ggccaggccc 


acgtctccgc 


ttgccctggc 


tctcatcctc 


caagtaacgg 


agcactcgct 


540 


cttgctcctc 


tcccgagcgg 


ttcataaaat 


cattccagac 


ctccacatag 


gtggcattgc 


600 


tgcaggcctc 


tgcgaagatg 


ccaggtgcag 


caggagtcag 


gagggccccc 


ttccagtaag 


660 


actggagaga 


gtgtgtcagt 


gccgtcgtcc 


atggagagtg 


aggaagccgc 


agaagtcaca 


720 


gatgacccaa 


tggagcaaga 


ctaactactc 


agaaacactt 


agaggcagta 


ttttacatac 


780 


agttctggtt 


ttacactgta 


taagactttt 


aagtaataaa 


gtggaccttt 


agttttacaa 


840 


gagaaacagg 


ctgtaaaata 


aagaacctta 


agaataaacc 


ctgaaggttg 


tatgtggaag 


900 


agctgtgagt 


acggctcctc 


tgggtcccgc 


ttagtgctga 


ttttcttgtt 


tggtttggtt 


960 
ttttgtttgc
aatttagaat 
atctcatttg 
attttggtca 
atttttattg 
tgagggggga 
attggtaatg 
aaaaatctgg 
cacattccaa 
aacaagactt 
agcttgctgt 
caggtcccat 
aataatccta 
gaggacttca 
accatgattt 
ccagtgttat 
attttcaaat 
ctgccaaatt 
ctgtaaagta 
ataatttcgt 
atattaatgt 
tttcagacat 
ttatctatcc 
tagacgttac 
tgactgatga 
aactgtcgat 
aaaaagatta 



ttttttcctg 
taaagttttt 
gtttatacat 
tagtttaaac 
ttcgactgta 
tgttatattt 
actgagttca 
cccctgtgag 
gagatctgtg 
catttattta 
ctgtagaatg 
aacattcggt 
tatgccgcat 
gtaatgtacc 
gtataacatt 
atttttcaga 
tcacatattt 
tatacctgtt 
ggttttgtag 
gtgtaactga 
tatgaaatat 
ctgaggtaca 
taagcttcgg 
tatggctaaa 
ccttcaggtt 
ttactgtacc 
aaatgaagca 



gtgtactcaa 
aaactggaag 
acacaaggaa 
atgttaacat 
caatggaacc 
ttctgtttct 
cttctttcag 
accctgggaa 
caaactaaat 
tggaaagtgt 
agtgcttcgg 
tgcttttctg 
ggcactggtg 
tgagctaaaa 
ttgaagtgaa 
tcaacacaaa 
aaagtcatgc 
tcttcagctg 
actgtaatgt 
atgcttgggc 
tggagtacat 
gggatggacc 
gtgatcagag 
aagatatttg 
cacagcagct
caaaggggtg 
aaatttaagt 



atcagtggtt 
ataattataa 
gatttttgtc 
gtgaattaat 
ttaagtcata 
ataagagatg 
aagacatctt 
tctttcagtc 
tcttttgtat 
gctttatctt 
agcctgacat 
aacactgtca 
tgaaacatca 
tgaccgaggc 
ttaatatttt 
gcacaatggt 
aagctgcaac 
tactttttga 
gttcactgcc 
tttcaataca 
ttttatcaaa 
taatagctga 
ggaaaccaca 
gctccgtttg 
ggacagtaga 
aaagtcaaat 
gtgacattca 



tgtgtataga 
ttttgaaaac 
ttgtctctag 
agggtttcat 
tatacatata 
aatacagtgg 
ctcttctgag 
tgttgaaata 
acttctaagg 
gggagttgtg 
gaaaccatct 
acatcacatc 
ctgtactgga 
gttaggggtg 
tgaacatgct 
tactactcta 
ttccctgtca 
tatttagaat 
tttgtgaagc 
gtattcatat 
atacaaaatc 
tacaaacagt 
aggtttgcat 
ttcataaagt 
tttatgaatc 
gtaacttcaa 
tttgtaatgg 



tttttttttt 
tttaaagatg 
cagtttccat 
gtggtttcag 
tagattatcc 
atactttttt 
tagttgagac 
ccaggttaaa 
tgcctgagac 
ctcaagcatt 
caagagccca 
tgtctttctg 
agaattagta 
cacagaaacc 
tcttcaacag 
taaactcaat 
gaattactgg 
ttttaaattt 
ggtatattgt 
aaagcaataa 
tcttttttag 
ttctcacact 
tttgactgct 
aatatgctac 
tgtctagtaa 
gttttttggc 
cctgttagag 



1020 

1080 

1140

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 
ttgagggatt


gacagtaaga 


gcagatttta 


aacattttag gtttagagtt tttgagattt


2640 


tccttaggat 


atttatgagg 


ttgttgtcaa 


taaacctgtt 


ctctaagctc 


ctgtgacctt 


2700 


tgatagtctt 


tatttatggt 


gccatggacc 


atttacaaac 


aagtcagttt 


gctgttggtt 


2760 


aaaagttgaa 


gatttcgcat 


catgaataga 


ggtctgtggc tttatttgta aacttgcaat 


2820 


tgctatcttt 


gcaaggggaa 


gtgtatttct 


ttattaaata 


aagtacaatt 


aataatggtg 


2880 


aatgtaccaa 


aatgacatca 


ctcaattcta 


tgagaggtct 


gcattttaac 


ctatagttta 


2940 


atagctttaa 


tatttattag 


ctattcctat 


gttgatcata gatgaaagtt gttgctgttt 


3000 


atacagatac 


acgtaggata 


ctggtgaagg 


gtctctaggg 


cagttgtaat 


attcatcacc 


3060 


gtgtgaggcc 


atgttcagac 


tgtggatgca


ttggtccctt 


ggaagtccat 


ttccacacgt 


3120 


ggtgttagat 


gcacttacag 


gagactgcag 


gaaagtgtca gttctaatga gcactgagtc 


3180


ctttgtgagt 


gacagaatga 


gacccaagag 


ggagggtcaa 


gccgtcccgt 


ggctgaagta 


3240 


gctggctctt 


tgatgttgaa 


cagattccta 


accgggttct 


cctttcccaa 


gcctaactca 


3300 


gatctgggca 


gtgctgatgg 


tgctgacaga 


atacacatga 


catggttttg 


ccacccctcc 


3360 


cttttaaaaa 


gtgaaaacat 


tttgaaaact 


ctataaagtt 


ctgtacatgt 


agaacagaag 


3420 


tgagtagtga 


aaatatattt 


tgaggattag 


taaacttaat 


ccacttaatt 


gtcacaactc 


3480 


cggtctttcc 


catatgtagc 


cagagcaatg 


gagttacaac 


tctggctttc 


gaaagctatt 


3540 


ccagaaaccc 


tgccccagaa 


agtcttaaag 


cattgagatc 


cttgtgtttt 


attttggcag 


3600 


tgtagatagg 


catgtattta 


tgcatttgta 


aaatcaattt 


ttttcaaata 


atgtatgtaa 


3660 


tgtactagct 


taaacggtac 


tgggcagagc 


ctagagctac 


tgcgaggatt 


gaatgtgaag 


3720 


ccggtatcgc 


gggtggaaat 


gtacctgcag 


agctacagca 


aactattccg ggtagtgttt 


3780 


caggctgcct 


ttgagcagga 


gtttcttaga 


tctattggtt 


ttgacaaact 


gaagatcagt 


3840 


tcttgagatt 


tgtgttaatc 


atgagatgaa 


tggatgcaaa aaaccccttg taatttcatg 


3900 


tggaattatg 


aaaattagct 


tgatgggata 


ttggcctaac 


aaggatgatg gtatgtactg 


3960 


gctagaatac 


atatatttca 


catataaaaa 


ataatccgga caccagaatt 


ctcttctttc 


4020 


aatttggagc 


ctagatcgat 


cactttgcca 


aataaatgta 


ttattttcat 


aatgcaataa 


4080 


agtgtaaact 












4090 
<210> 30

<211> 3272 

<212> DNA 

<213> Mus musculus 

<400> 30



tttttttttt ttttttttgt ttcaagacac agtggcccag gttggcctag aactcactat 60
gtagcctatg ctggcctcaa attcacaaca ctcttcttgc ttgtctcctg aatgctggga 120
ttgcagggct tgttcatcat gtctggttta tgtagtactg gaaatagaag gcagggttcc 180
atgcacacta ggcaggtgct ctgctagagt agtggatcaa tcgcagtgac tgcatctggg 240
gagtagtgga tcaatcgcag tgactgcatc tggggagtag tggatcaatc gcagtgactg 300
catctggaga gtagtggatc aatcgcagtg actgcatctg gggtacactg aagctgtctg 360
gggacatttc cagttaacac cactgagaat gatcaggctc cagcaggtga ggcaagggac 420
gctgataagc atcttgatgt gcaaagatgt tgcttacaat gagctagcat ctggcccaaa 480
tgtgggaaag tacaaaggct taaagtagac tgaagtctcc acgtgcaggg gatctgtgaa 540
gctttgctgc ctgctgccag ttggtgctaa ttacttgtgt gtcactgtaa ggggaacctt 600
gagggaaatt gtgtgggaag aaagctgttc tttggctctt catttcagag ggcctcaatc 660
cagtcctgca ttgaagttag cctgagtgcc ccctgctgtg acctcatact ccaggcacct 720
gggactgggg tggtcagatc agggataaga tggccagttc tgtctgattt tgactttcag 780
aaaagtgaat gaagagttat ctagtgcagg tgtgtcccac accagtgaag gacatctgtg 840
ctgttactcg gtaggatcta gcagtcgcat cccaaagcaa ttaagttact aaccagagta 900
ctgggcagtc acctgaaaca cattcttatc cttagaaatc ttccaagcga accccccaga 960
caactctggt tactgatagc tgtpcataga ctgataggca gtctcccttg ctgaagacaa 1020
cacctacata actcactgaa cacggagagg tcatgctggt gcctttggcc ctgcagacta 1080
gtgtccgtgg ttctgggagg tactctgcat gctatcagag gagaaaggtg aacaccaacc 1140
cagatacaaa tctttttatc tgcaatggtg accttcctgc atgagatgct ggggcaatgg 1200
cggcactaag gttgtagaag taacccacac tctctggatt taaagaactg catgggatag 1260
agcccatgcc tcacgctgac ttggtggcca ggaacctgag actacataaa ccatgaccta 1320
gggggaaact attgttctgc tcaagggcta tagcaataca acaattccca atgtcactct 1380
gctgtactca cagatcagca ccttgctcag ccatcatcag agaagcttcc ctctttagta 1440
gatgggaata aatacagaga ctcaacaact ggacattgtg cagatagtga aagacattgg 1500

aacactcagc cctaaatggt aacaggaact atgcagatga ggaggtggag ccagagggga 1560 

tggagggctc caagggaaca gtgccttcca ggcccaacag aactgatgca catatgagct 1620 

gacagactgt ggcagcgcac agggcctatg caggtccaaa ccagatggtt ctcagtactg 1680 

aggtggggga aggagacata agctctcatc cctaacccaa agctataact aacaactgcc 1740 

tatgaaggaa aaaattaatt tctccaagag tatcttactg gatatacaaa tacaattaag 1800 

ggcaggtccc atgctcagga aaacatggtg aacaccacaa gtaaactctg tgtgtgtgtg 1860 

tgtgtgtgtg tgtgtgtgtg tgtgtgtgag agagagagag agagagagag agagagagag 1920 

agagaagtca tggatagccc tggtgttgtt tgtattttga tgctaattcc gattccctaa 1980 

gaggggctgc ctaggaggag gagtgaatca cctactcagg tgacttcatg tgaccattct 2040 

aatgaataaa gatttcaagg aaaccattcc ctggacgggt aggcgggtct tccaggtcag 2100 

acgggcatgg cgggggtggg gtggggcaaa agaggagagg agagtgggct ggaggacagg 2160 

agcataaaga cgaaaatgta ggtagtgaat ccccgctccc ccaccccctg tttcaggtgg 2220 

cagatctgtt tgggtcagct accagaggat ttaacttaga atggctgata aattaggatg 2280 

ttaattgttg tgcccagtga ttgagttacc gttgattctg aactaagttt gtgtggtgtt 2340 

ttctttcact tggcggctca actgggttcc ggagagaaaa ggtacagtga tgtggaatcc 2400 

cagccagcca caggaatttg gaagtgtgga gctggcatgg cagcttaaca agcagggtgg 2460 

agagctccag gagcagagag tctgctgaga agaacaaggc ccgccagtgc cttgctggca 2520 

atagcatgga tagtttcttt ttacatttcc tgctgtgtgt atgtgtgtgt gtctatgtga 2580 

gtgtctctgt ctgtgtctat gcatgtgtct gtatgtcttg tgctcttttt tagtttcttt 2640 

tttgtttatg tgtcttcgtt ttgttttaac cttgaatgct tgtctttttt atatgcctat 2700 

ttttttttaa gagagaaaga aggtgtgggg ttgaaagggt ggagaggtgg aaaggatctg 2760 

ggaggagaca agggaaggga aaaccatggc cagaatatat cacatgaaag taactttatt 2820 

ttcaattaaa aaaagaaatt tcccattatg gttatgaggt agcaatgaac actacggttg 2880 

ggggtcagca catgaggaac tttgttaaaa gactgcagca ttagaagggt tgagaaccac 2940 

tgtcctatga acttctggct gtcctcatgt tcctccaccc tgagaaatcg caatactgct 3000 

gtatttactg tcgcctgcaa accctcccta agggttggtg agatggctca gtgagtgggt 3060 
tcttctatct tccacccttc ttcactcctt ccccaccctt tcaacctcca acactaaata 3120

ggaaagaaaa agaatagaga ggaaaagggg gccaccttat tgctagacta cttcctgctg 3180

attaaaggtg tccagttcct tgggtacctc tgatctttgt catcaggata tctattttcc 3240 
tgttgttgtt tcttttttgc tcaagactac tc 



3272 



<210> 31 

<211> 3821 

<212> DNA 

<213> Mus musculus

<400> 31



gc^tggaa accagagact ccgagggagg cgaccaggct gcggaggaga gggccggctc 60
acaaagtgct gctttgacac atccttagga tggaagttaa gtgaaaacag aaccacacaa 120
aacaaaactc cgcgaagtgg tgctgctacg gaggaaacca aggggagaaa aacccggtgg 180
gcaggtcaat ggttgcttcg cagcgctttg gcaagtttgt ggaacacttt ctaggaatta 240
ggtctttttt gtacccccat catcttcttg acttccgaag aaagaagttg tgtttggatt 300
gcaatggagt ctaaggagac agggctagac gcacgtgaat agtcccgcca gctgggctga 360
atttgtggga atttagaaag acagcctgtg gaagtgcaac gtctctgaag tccccctggg 420
ttcattcgga tggcacctaa cgcgtcccgt gacagacctc ttcaccaaca gcttccgatg 480
ttgccatttt gctcttcttg accttaatta atctctagga aagtctaaac ttcggaccta 540
cctctttttt tgatacttat tttttgtact tctgctctct gggattggtt tcttaaacaa 600
cctggatcct ttttcatatg tcaaaatgaa tcctctgatg tttacactat tattgctctt 660
tggatttctc tgcattcaga ttgatggatc tcgtcttcgt caagaagact ttccccccag 720
gatcgtagaa cacccttctg atgtcatcgt ctccaaggga gagcccacca ctctgaactg 780
taaagcagag ggccgaccca cccccaccat tgaatggtac aaggatggtg agagggtgga 840
gacagacaag gatgatccca ggtcccacag aatgcttctg cccagcggat ctttattctt 900
tttgcgaatt gttcatgggc gcagaagtaa accggacgaa gggagttacg tttgtgttgc 960
aaggaactat cttggtgaag cagtgagtcg aaatgcatct ctggaagtgg cattattgcg 1020
agatgacttc cggcaaaacc ccacagatgt ggtagtcgca gctggagagc ctgcaatctt 1080
ggagtgccag ccaccacggg gacacccaga accaaccatc tactggaaaa aggacaaagt 1140
ccgaattgat gacaaggaag agagaataag tatccgtggt gggaagctga tgatctctaa 1200
tactaggaaa


agcgatgctg 


gcatgtacac 


ctgtgtggga 


accaatatgg 


tgggagaaag 


1260 


ggacagcgac 


cctgcagagc 


tcactgtctt 


tgaacgaccc 


acatttctca 


ggaggccaat 


1320 


taaccaggtg 


gtgctagagg 


aagaagctgt 


agaattccgt 


tgtcaggtcc 


aaggagatcc 


1380 


ccagccaacg 


gtgaggtgga 


aaaaagatga 


tgcagacttg 


ccgagaggaa 


ggtatgatat 


1440 


caaagatgac 


tacacgctga 


gaattaaaaa 


ggccatgagt 


actgatgaag 


gtacctatgt 


1500 


gtgtattgct 


gagaatcggg 


tgggaaaagt 


ggaagcctct 


gctaccctca 


ctgtccgagt 


1560 


tcgccctgtt 


gctcctccac 


agtttgtggt 


taggccaaga 


gatcagatcg 


ttgctcaagg 


1620 


ccgaacagtg 


acattcccct 


gtgaaactaa 


aggaaaccca 


cagccagctg 


ttttttggca 


1680 


gaaagaaggc 


agccagaacc 


tacttttccc 


gaatcaacct 


cagcagccca 


acagccgatg 


1740 


ttcagtgtcg 


cccacggggg 


acctcaccat 


caccaacatc 


cagcgttcag 


atgcgggtta 


1800 


ctacatctgc 


caggccctaa 


ccgtggcagg 


aagcatttta 


gctaaagcac 


agttggaagt 


1860 


tactgacgtt 


ttgacagata 


gacctccacc 


cataatcttg 


caaggaccaa 


taaaccaaac 


1920 


acttgcagta 


gacggtacag 


cattgttgaa 


gtgtaaagcc 


actggtgagc 


ctctgcctgt 


1980 


aattagctgg 


ctaaaggagg 


gctttacttt 


tctggggaga 


gatccaagag 


ccacgatcca 


2040 


agaccaagga 


acactgcaga 


ttaagaattt 


acggatatct 


gatactggca 


cttatacttg 


2100 


tgtggctaca 


agttccagtg 


gagagacttc 


ctggagtgca 


gtgctggatg 


taacagaatc 


2160 


tggagcaaca 


atcagtaaaa 


attatgatat 


gaatgacctc 


ccgggaccac 


catccaaacc 


2220 


tcaggtcact 


gatgtttcta 


agaacagtgt 


caccttatcc 


tggcagccag 


gtacacctgg 


2280 


cgttcttcct 


gcaagcgcgt 


atatcattga 


ggctttcagc 


caatcggtga 


gcaatagctg 


2340 


gcagacagtg 


gcaaaccatg 


ttaagacaac 


tctgtataca 


gtaagggggc 


tgaggccaac 


2400 


acaatctact 


tgtttatggt 


cagagcgatc


aacccacaag 


gtctcagtga 


tccaagtcct 


2460 


atgtcggatc 


ctgtacgcac 


acaagatatc 


agccccccag 


cacaaggagt 


ggaccacaga 


2520 


caggtgcaga 


aggaattagg 


tgatgtcgtt 


gttcgtctcc 


ataatccagt 


tgtcctgaca 


2580 


cctacaactg 


ttcaagtcac 


atggacggtg 


gaccgacaac 


cccagtttat 


tcagggctac 


2640 


agagtgatgt 


accgtcagac 


ttcgggacta 


caagcctcaa 


ctgtgtggca 


gaatctagac 


2700 


gccaaagtcc 


cgactgagag 


gagtgctgtc 


cttgtgaatt 


tgaaaaaggg 


ggtgacttat 


2760 


gaaattaaag 


tccggccgta 


ttttaacgag 


ttccaaggaa 


tggacagtga 


atcgaaaaca 


2820 
gtccgaacca
ggaagtcaca 
aatggaatta 
aataaaacgg 
cagtaccggg 
cagccgataa 
actgagcaaa 
tgctgggtaa 
ggactcagca 
agccgtccca 
gccagccacg 
tgggcgtgga 
ctcggatgga 
cagccaaata 
ccacgaactg 
agacctcatg 
aggtggaaaa 



ctgaggaagc 
acagcacaag 
ttcaggaata 
tggatgcagc 
tagaagtggc 
taattggggg 
tcacggatgt 
ttctgatggg 
attatgctgt 
ggtcttctaa 
agtttgccag 
gatgtgctgc 
gccatttata 
acacaggcca 
gcagttgatc 
ggatttggtt 
aagaagaaaa 



cccaagtgcc 
catcagtgtt 
taagatctgg 
cattcgctct 
agctagcaca 
acgtaacgaa 
cgtgaagcaa 
ttttagcatc 
aacatttcaa 
atgctggcga 
tgaacaatag 
ctccggtgcc 
gcagcattga 
ccccatatgc 
ttcctgatcc 
attcgctacc 
ctaaaaattc 



cctccccagt 
tcctgggatc 
tgtctgggaa 
gtagtaatag 
agtgcagggg 
gttgtcatta 
ccggcattta 
tggttgtact 
agaggagatg 
tcccaattac 
caatagtggc 
aggccaaggg 
cttcactacc 
cactacacaa 
acagtggaaa 
tgatcagaac 
ttcgaaagcg 



ctgtcactgt gctgacagtt 
ctccaccagc cgaccaccag 
acgaaacgcg attccatatc 
gtggcttgtt ccctggaatt 
ttggagtaaa aagtgaacca 
ctgaaaacaa taacagcatc 
tagctggcat tggtggtgcc 
ggagaagaaa gaagagaaag 
gaggactaat gagcaatggg 
ccatggcttg ctgattcttg 
ccaaatgaaa ttggaaattt 
gataaaacag cgaccatgct 
aaaaccactt acaacagttc 
atcctgcatt caaacagcat 
agctcagttc aacagaagac 
aaggggaaca acggtgggaa 



<210> 32 

<211> 1490 

<212> DNA 

<213> Mus musculus 

<400> 32 

tgaagaaaat gaagacggga gaaaaacgaa gctggccatc 
tgagtaatca gtcagttaag ataatagtta gagagttcta 
cgactatgag taggatgagt ggatactaaa tgtcccttgc 
cctacctaga gcctgctgtg gagttagaac ccagaactcc 
tactacgatt ggaaccttct tggttccaat gatagttctg 
gaatcgtgcc cagtgtttgc tgggtgtgag ggtcttcggc 



2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3821 



tcatatagag 
gaaactggtt 
tcccatccca 
attcaggtga 
gaaagcaaac 
agtggggacc 



cagtggactt 
caaaatggtt 
ccatcccaat 
cagctaagtc 
aatgaaaaga 
agatggtgag 



60 
120 
180 
240 
300 
360 
gaacggaagc aggcttctgc atcgccaggg ttcaggttcc ttctcctggt ttgagttctg 420

tttttttttt ttttttaaat gtgaagagat tttctttgtc atttctaaaa ctcctgtcag 480 

ttgctggatg tcctaaagct gttgaatatg gacgtaactg taaatcccag agtgttttat 540 

tttgagatga gagttttgct acagtttata caggatattt tcctatttag acctcagggc 600 

tatcttggga cccactacta agtgtgaccc ctccccgcag cttcaattct gggtagatga 660 

gttacataac ttttaaatgt gatatccagc aggaagaatg tatttctctc ttttaacttc 720 

gggcaaattt tgttttaaag gtctagaaaa aagacagtag aagaaaacag aggtaaaaga 780 

attaaaaacc aactgtaaaa taaacattct catggtatac aattgtgcta cctgaataaa 840 

cttatgtgca taaattattt aaaagtgcta tgaaacatat ggtattttcc tgtcatttgt 900 

ttgggttggt gtggttttat tcatttgatt tccacatatt tgcactttat ttttcaaagc 960 

taagggccct tctagtcgtg atccacccct tccaggaggg gagcacccct ggacaaaatg 1020 

tacctctgtt tccctgtgta tctctttatg agtggcacgc catatgcgct ttcatccagg 1080 

gcttttcttc ccccatcaat taaaatgtcc ttgagattta aaaataattg gaaatatatt 1140 

tttatattta ttgtgcgtgt gtgtgtgtgt gtgtgtgtgt gtgtgagaga gagagagaga 1200 

gagagagaga gagagagaga gagagaacat gctggtgtgt tagtgtagtg ggttaaagga 1260 

cggctgtaga agctggcccc ctccttccac cacatataca ctagggatca aactcaagtc 1320 

gtcaggctta gcagcaagcg ccttgaccac agagtcatgt caccagtcta aagatgtagt 1380 

tcaggttgac ctcaaagttg tgatcctcct gcctctgcct cttgagcata tccttcttgc 1440 

atgcaccacc acaattgact taaaatattt taaaaatcag ttttaatggc 1490 

<210> 33 

<211> 2185 

<212> DNA 

<213> Mus musculus 

<400> 33 

ggctgctgct gctgctgctg ctgctgctgg agcaaatgaa gaactctttt tcttaagcag 60 

ataaccagct tctggcagtt gcatgatctt gctattgaag tggaccttgg taaaaagtgc 120 

tggtatcact ccatatttgc ctgtcccatt cttcgtcagc aaacaacaga taacaatcca 180 

cccatgaaat tggtttgtgg tcatattata tcaagagacg ccctgaataa aatgtttaat 240 

ggtagcaaat taaaatgtcc atattgccca atggaacaga gtccaggaga tgccaaacag 300 
atatttttct gaagaatgag tttgtttgca atttgtaagt gaaactgaat tatgggtaca 360
ttcaagacaa gagtgttcca ttgactgcag ctatccaagg accgcctgtt cataagctat 420 
gctccagagg ctgccgattc actcgtgtgc acggaggggg tgctccagat gggaatcaca 480
cagggctttc ttcactcttg gtcttcgttt ctgatcaagt aaacaccagc agttgtcatt 540
cagtgcaggt ttttgtactt ctatatggtg atttttttac ttaaaagcag aaacagaagt 600
tgaccttcct gacatgtgtt taatattcct cctgctttta cagattctga cgttttcttg 660
ataattgtaa gcttgagagt gtttgtggaa gaacttactt tcttcttatg tatacataat 720
taaatgaaaa gtcttcatag gagtttgaca aaatgaattg tggttataaa acgaatttgc 780
ttttttgttg ttttgttttt cttttttggc ctaaggaaga aagctgtgat aaatttcaaa 840
tttgcatagc ttttcaatgt tttgctctgc tcccgctctt gcttcagagt cggcacactc 900
acctgcattt gagttctgtc tatagcccag caggctgcct gtttaaaatc ccatcatgaa 960
tttaacaggc tgatgtgaga atgaaataga tattactggg gttttgttgt tgtttgttta 1020
tttgttttgc ttttattacc aagaggtgct ttttaataaa tggatattga agttagggtg 1080
ttactaattt gatgtatggt ttcacagtct agcatactgt cctttgacat ctgcctttaa 1140
gacttggctg agtgtctcat tagtttatca tcacagatac gcagtgttat gcatgtgtat 1200
agaagtgtgt gcaccagcat caaacattgt gtgtgtggaa gggaagaagc ctgtccattc 1260
taaacgcagt tgccagtctc atcacttcag gtccttacgg gcaggctcta gcaactttcc 1320
gtgtatggac ctcgtttttt gctgttttgt gtttaattag tatattgttc atgcctctct 1380
tctgcagtgt ctcatctcat agactgtgaa cctgtatatt attcaaatgg ctacagataa 1440
tgctcttttc ttttgtgagg tctcttcatt taatgcactg cccagaaaga gccatgtgta 1500
agagttgttc tctgtttgag gaactaacta catggaaaag acttctgact taaacccatg 1560
aaatacttca tcttgagaag agtgctatgt ggaaatcacc aaatatctcg caactttatt 1620
tcatctggtg taaatctgaa catcaacata ggaaaactgt catgagaaaa tgaaaaagca 1680
taaacacaga agcaacgaga aatgtgactc ttgttatttt aaaccacaga cggacttggg 1740
ttaagggaat ggggacgaca gctttggtgc taagttaatc agaaattgcg agcatgcaca 1800
gtggtatgcc agcctgggtg atgctttcct aggagagccg gtatttgctt gtaagggaaa 1860
gaatggtatt gtagaaaaac ccaagaaatg accacgtggt cagtttcatg gtgatggcta 1920






WO 2005/005597 



PCT/US2003/027106 



ctgtgtactg tgtaggttac ctgggtcaag actggtccag cttagagtgc atgctctgtt 1980 

catactcagc ctagctcaca gtgacttgtg cttcactgtc aggatgctct gtaaagttcc 2040 

tcatccctgt gaagcaggga aagaacagag gtgcctctgg actgcaggag aaaggcgtcg 2100 

tgccaggcag tatcctgtgg agggaggagg gtgtacgttc gtttactatg gatagttttt 2160 

cctttgaatt aaattgtagc gtgtc 2185 

<210> 34 

<211> 3598 

<212> DNA 

<213> Mus musculus 

<400> 34 



aaagagagga 


aagatttaaa 


atagccatat 


taggttatct 


ataatggggt 


caccatccac 


60 


ctcagaaaat 


gggcagtgct 


cctactgagg gctatcaatg 


aaagacctgt 


gctgtgctgt 


120 


gctttgcact 


gtctatttga 


aagtctcaga 


aggtatggtt 


taccttatcc 


cagccaccgt 


180 


aattaagtga 


atgcttttag 


actgtgaaag 


gatagttgcc 


atttggccta 


agagcactta 


240 


ggcagagcgc 


ttcattcccg 


tgtgatgctg 


cataccgttg 


tttttattta 


caatcccaaa 


300 


cgctctgtgc 


cttggttttt 


cactcgccaa 


aaccaacctt 


actctactaa 


atgaaatgca 


360 


atcttaccag 


tttaaccagt 


atgctgtgat 


attgtagtga 


gtctcagaga 


tgactggaaa 


420 


caggaggcct gtcatttcag 


tgagcaccat 


cacctaagcg 


gtaatcattt 


ctcctgtgtc 


480 


ttaaactgct 


ctgactcctg 


agcaagtgtt 


catgtctgtg 


tgtcaaaata 


aaaagttttg 


540 


tgtgaaggac 


tgtctttgct 


ttcctccatg 


gtttttactg 


tacatttccc 


tgtcgtcatg 


600 


aagtggggtg 


cagagactca 


ccttttatta 


aagtagctgt 


gtgaagtaag 


ctttcggtgt 


660 


atccctaagt 


ctgttagcat 


gtactccttt 


gtaatatctt 


gagtacggtg 


atctttatgt 


720 


acgttttact 


taccaactca 


caaatactgt 


agcaaatgaa 


tgtgaacatt 


tactttctga 


780 


aaagccagac 


aattttgttt 


tcaattatag 


tactgagcaa 


ttaagcattt 


agataatctt 


840 


ttaataacca aagctggtcc 


cattctggtt 


ttgccttttg 


cttacttgtt 


gcttcaatgt 


900 


ttttagagca gatgtttttg gttttttggt 


ttttttgttt 


tttctaatat 


atgtggcttc 


960 


attgttttaa 


gttggtgtgg 


gtatctaact 


tacaaagctt 


ttacattttc 


tttaactggg 


1020 


cctcatgtgg 


tcccgagtag 


ctcttgtaaa 


cttagtcgga 


cagtaagtca 


ataaactctg 


1080 
cttgtttcct cctgagcctc tgtgtgcccg gtagccacat ctcgtacatc tgcatcaggt 1140
tcagtggttg gtctttcccc caggttaccg tagtccagcg tcatgagttg aatgtgatcc 1200
gtggtgatat cttctacaga cattcaggtt ttctttcctt gttacctgcc tctgttcact 1260
tgtttaactc ttcttgtcta ttcaccccca ggctctgccc caccccaact ctattctggg 1320
gactaatccc ctggcccgct ttaggcttgc tgggaaggtg cctagcactg agctaacccc 1380
tcagctcctg ggtttggctc ccttgtttca gtgtttgcta aacccttttc cacccttctc 1440
accagtccct accttttgtt ttcaaagcct tgctccatac ttgtagctgt ttgtttttct 1500
tcagtgtttt tccaagtccc aaaattgtta gtacttttaa ttaaattgtg tggatgctat 1560
aaataaattt gataagtacc aacatttact taaagtcaca caatagtgaa acattgggac 1620
aaaattcagc ccctcatctt caaatcagaa atcctctttg gtagatcctt atacgttcac 1680
aggtccagct gtggctggcg tcagctgccc ctgattgtgg gaagcattag atcctgtcct 1740
gaaggagtct gggtacccgc ttagtgtcct ctggtcaaaa tgtttctagc ctgtattcct 1800
gggtaaacat tcagaatgac attgccaagc acaggctaag ttacaccact ggagtacctt 1860
tgcaaataga aatgtcctat caaagatgac accggtgatc aaagagttgg gactaggaat 1920
tttagccagg aattaaatct cacgctcggt gtgctaatta aataactggg gggcgggggg 1980
cttgaggggg tgagcaagtc ctgtctgtgg aagctgacta gtaaatatgg cacttaattc 2040
tgccaatgtt caggtcaagc aatttaaggc agttagcata ttttgaaagt agggaactgt 2100
tgttttgttt tgagacacgg tctcacttt. tagcccaggg ttggcctgga actcattttg 2160
tagtctagtc ggtcttcaaa ctcaaggcag tcctcctgcc ttaaccttcc aagtgctggg 2220
attatagacc taaaccgtag tgtccagata ctgctcagtt ttaatagata cactataggg 2280
aggaatgctc caaaaaagat tcatcttgta ataacgtgag catagttcag gtcagccggc 2340
ctggttgatc tcagctcctt cactcagggg cgagggctag ctcagttcct ctgccctggc 2400
tgtgatagta ctgggtagag cagcagtctt caacctgtgg gtcacatgtc agatgtctgc 2460
attatgattc gtaacagtag caacattgca gtcttgaagt agcaacaaaa taactcgtgg 2520
tgtctgcatg aggaactgtg ttaagggtcc cggcgttagg aaggtatctg ttgaggattg 2580
tgctttcctc tcgccctgat gttgtctttg cttccctggc tcccctttct cgcctttcct 2640
cctcttatgt ctggcagcat tttctcactg aaggaactgt caacatgaac ctctctctct 2700
cactcactct cattctctct ctctctctct ctctctctct ctctctctct ctctctgtct 2760
ctctctctct gtctctgtct tgcacacaca cacacaaaat gacaaactct tcccccccca 2820
tccaaaaaga aatctacctg tatttcaaca atagattaat gctgaaattt tgactcatac 2880
aaactaaggg ttttttctta aactcgtaga ttattaattt gaataacgta cagagaattt 2940
taagtttgct taagatctct ttggataaga acacctatta aaaaatattt gagggggctg 3000
gtgagatggc tcagagggtt agagcacccg actgttcttc caaaggtcca gagttcaaat 3060
cccagcaacc acatggtggc tcacaaccat ccgtaacgag atctgactcc ctcttctgga 3120
gtgtttgaag acagctacaa tgtacttaca tataataaat aaataaatct ttaaaaaaaa 3180
atatttgagg actagagttt tgtcacaaag ataaaactcc aaccgccttc acattacttc 3240
cttctggacg tggaagctgg gtaaacagga gagttacttc cttcttgtaa aatgtttgta 3300
ctggagatgt tgaaaggcca gctctgtgtt ctcaggactg taaattatct aagcattttg 3360
atggattggg ctggcttaat ttcctccctc tagttaaaaa gaaatgcagt tgtttacatc 3420
ttgcttgtag ctaatcttaa aagagagccc tgtttcactc aggtcttcag ggcacgtgtg 3480
ctacagaatt ttttggaaat gtgtgacttg cgcaaagctt ggtggtagag cacttgccta 3540
gaatgtgtga agaagtcctg tgtgtgtgtt taaaatgtac tttttaataa aacttttt 3598


<210> 35 

<211> 4153 

<212> DNA 

<213> Mus musculus 








<400> 35 

gatcagaaat tcaaagccag cctgagctag atagtaaaag gtttgttttt ttttttttaa 60
gttaaaaata ttttaaatta tttctgttaa ataaataaaa ttttaaaacg taaaaattca 120
cagcccaaaa ttgtatatat ggaatggggt tgattacata cctctaattt tgcatgtaca 180
gaacaagacc gatggggaaa aaaaaataca tctagaatta aacttcaccc agaaggatgg 240
tgggccaatg taatagtctc ttcctctccc agtgtagttc cagtccaggg ctaaacagta 300
tggtgtgtgc ctctgctgct cttacaggaa agcgagcagg cagaaaagat caacatcagc 360
cttgccttct tcctgtatga cctcctgtca atcatggaca gaggcttcgt gttcaacctc 420
atcaagcatt actgcagcca gctgtcagcc aagctgaata tccttccaac gctcatctcc 480
atgcggctgg aattcctgag gatcctctgc agccatgagc actacctcaa cttgaacctc 540
ctcttcatga ataccgacac cgcaccagca tctccctgcc cctccatatc ctcccaggta 600
gtctgctaac tacaggaaag gggcgagctg ctttgttaat tagcttagtt caatggggca 660
tctcctatct tactaattag aggaaaatca cactattcaa accagatcag ttgatccaga 720
tatgggctct acggttccct gaagctgcag catctaaatt gaccattttc aaatcaagta 780
atttggcaca agggtcccta aggaggttaa ctatgtagga agaaatctgt aagcctagga 840
aaatgaagaa acacaggtta agtctcaagt cctccactgt cgacataaac caattattgt 900
acaatatagt gtgcaaatac aggtttagta tatttgcaaa tacaggtttg caatgtctag 960
caagattcag aagcactgtg ttagcctagt ctctttgtcc gacagcttac aaaggaagaa 1020
ctgcagaggg ggaggggcac gagatgaaat gattcccaat aggatttgac tgtggcaaat 1080
gtccttactg cgtcggcacc agaggtttct caaggagtta gtgtcctcta gaaaaccctt 1140
aaagaaatgt tattttatag ccctaatcca gatgtgggaa acaaggcaga tgttgacagc 1200
agtaggcaag atagctcagg atggtattag attctattct gctgcatatc cttggtacta 1260
gtacagaggg aaaggcttga cagtgataag cgccttaagt tccaggacta agaggacggg 1320
ggtggggatc acatgggccc aggagttagt tccagatcag cctgagcatt atagtgaagc 1380
ctcatttaaa aataataata ataataataa gtcatagtag tagctatttg tgcctgacaa 1440
tatacctgaa gttagctcca tgaccattat aatgtgatct gggacacacg tgacttacaa 1500
ggaccaattc caaatggact tattattgtc tgctgtcgtc tgttctgtgg gctgaatcct 1560
ataacttccc catgctgacc ttgccctggg tcctctcccc atgctcctct tctggctgcc 1620
ctaaagagtt gacagaccca gggcccctac tccacagagc tccaggactt gcagggacac 1680
ttacttctgt ccccggtgaa ggatgtcacg ctcccggagt aagaacacgg gacaggaatg 1740
gcctagggct gccaactctg gcattgtcgt agcagacacc acttctggtt ccaggccagc 1800
taacatccag acactgtgct cagatgacct tcagaaagga ttctggggag acagaccaca 1860
gcgagggact gggtacagtc cttatttctc tgtccacgtg taccctgggc tgttcatctg 1920
caacttttaa taccagacag gcaagggagg gattagctag ttttctttgt tctgtttttc 1980
ccattttgct gttctgggga cctggtcata tcaggcacat ttcccaccac taagcttgac 2040
ccctagccta gttttcatag tcatgttcat gttggccagt cagtgcatgc gccgtccggc 2100
cagagctccg taaatgcaag ccagcactgt tttaattcca gctcttcata ctcataaagc 2160
tgccgttggt atttctgtag aactcgagtt cctgctccag tttccaggac caaaagattg 2220

ccagcatgtt cgatctgacc ccggagtacc ggcagcagca cttccttaca gggctgctct 2280 

tcacggagct ggctgttgcc ctggatgctg agggggatgg gtgagtatct gacgcctaaa 2340 

atggaacctg aagggaaaga tgttaacagc attcccagtt caacttctta tgtgtacagc 2400 

aagacctcag aactgtacca cttaccagtt ccaagaagaa gcactcctgt cttaagaaag 2460 

tggtactggt ggagtataga gaggcaaggt gattagaaga tccatgaaat gggcttcttt 2520 

ttacagcact ggcagttgaa ctgagagcct ctcacactgt aggccagtac tgtatcaatg 2580 

agccacattc ccagccccaa atgtctattg tttgtttgtt ttttattttc gtgacagttt 2640 

ctctgtgtag tcctggctat cctagaactc actctgtaga ccaggcttgc ctcaaactca 2700 

gaaatctgcc tgcctctgcc ttccaagtgc tagaattaat ggtgtgcacc accaccatac 2760 

ctgcctcaga aagaggtttt gttttgtttt gttttgtttt gttttgtttt gttttgtttt 2820 

gttttgtttt gtttttcgag acagggtttc tctgtatagc cctggctgtc ctggaactca 2880

ctttgtagac caggctggcc tcgaacttag aaatctgcct gcctctgcct cccgagtgct 2940 

gggattaaag gcgtgtgcca ccacgcccgg ctcagaaaga ctttttaaag tcccttgttt 3000 

gagcagaatt tgcagagaat gctttgctga gtgaattccc taagggcaaa cccattttgc 3060 

aggggaggct gctgaggtgg aacccagggc tttgcacaca ctaagcaggc gctcctccac 3120 

taagctatcc cgtcagccct aacaaggtca cctctgacaa agtgagcaca gagacgaaat 3180 

aaataccagc atgccactgt ccgaggctgg caaaggaagt gacggtgaag aagccagctg 3240 

tcttgggtct agatgacgct tcttatgggc aagccttctc aggcaaaggc agatgccata 3300 

ggcatgggtc tcgtgaatct tctcactgtg tccatccctg agcacctgag tgtatggcag 3360 

tgcccggact tacttgttgg ctcttggctt gtattctgtc atcttttctc tgctctaacg 3420 

acattccagt cttctcagaa ttttctgtcc tataactaac ttattttctc agcatcaaaa 3480 

agtgccctag gatatatgct gataaggtcc tgaaagaaga aaatttgcct tgcaatataa 3540 

acaacccccc attttcactt tcttaattgt ttttaataaa tagcatggct ttaagtaatt 3600 

tctgaaacta tcttttatat acacagtagc agtagttttt aattttctat ttttgtccca 3660 

gttggagaca tccctgccgt ctggttttga ttcttctaaa atcattgctg ggatattgct 3720 

atatagcttt agctgttccg ggcttcacta tgcaaactag actggcctca aacttacaga 3780 
gatctgcctg cctctgcctc cccccaccac cacccccagt actggagtta gaggcatgtg 3840
ccaccatgct gagctcctaa tatttttgaa gagcaaaagg ttgaaataga cattttccaa 3900
agaatgcgtg gtcaaaagca cacaaaaaag gtgtccagaa tctctaacag tcaggcagtg 3960
cttcattaaa acaatgatta tataccacat tctgcctact agaatagttt gatcaacaag 4020
acagacaggc cagggatgtg gctcagtggt agatctctag catgtgctaa ggtctgggtt 4080
ccatccccag cactttaaaa aaaaaaaaaa agaaagaaaa gaaaagaaaa gaaagaaaga 4140
aaaaagatgg ttg 4153



<210> 36 

<211> 3009 

<212> DNA 

<213> Mus musculus 

<400> 36 

atagcaacaa caggagcctg tagcaagagc aggagccaca tgggccctgt tgcacagagc 60
tcagagaagg cgatgatgcc catgccaggg taggaagcaa ggagttggaa gtttaatggc 120
gcatcttgga gatgcggcgt gactcctggc cgaggtctgg ttcttctcct agcccaagtg 180
gggacgctct ctctgccatc cttctgcctt ggctataccg aggatgcgca agagtctggt 240
gtcctcctag aggactctcc tgttactgtc tttccccttt gctgttcttc acatgaaatg 300
atgataataa agtaaaatca cgtcagtcca aacgtcacag ttttaaagtt caaaatgact 360
tcccaagtgt ggcctgcagg tcctctgccg gatcggtcgg aacaggttct cacgtgatcg 420
tggggaggtt cattctgagc cgctagtccc gccgagactt tatgctagtc aggaaactgt 480
gttctctgga aaggattatc ctgctggctg tcctacctct ctgttaagtg gcggcccgat 540
gccgtggact ggtgtgattt tattgcacta ttaatcatta tcatgggtga agctgccgta 600
gatggtgtaa agtggctggg tcccccttcg ttctgtctgc acagggctac aataaataag 660
ccagtgcgtg gacaggaagg gagtctttat cccagaatgg ggttcagcct ggagcaccat 720
cacccagctt cttagcggag ccctgggtgt ggccggatct cctctgggat ctctctattc 780
tccactgcac cctgggaatt aaggacagct atgcctctgc ttgagcaagc ttcccagccc 840
caccttttct ctctgcttgc ctccctgcct cgctccctgc cgggggagtg gcctcgcttg 900
ctctgggcac cccagtctct tcccctagca tccttggtct cctattcact cccctgttca 960






tttttgtttc 


cagtgaagtg 


tgccaccccc 


agccctccca 


cctcccctgc 


ttccccgatt 


1020 


ccaccagtca 


accccctctc 


gtctgaccgc 


ccacggggcg 


ccgacagcag 


ccataccatg 


1080 


cgggtctgag 


ctctgactgc 


aagccctggc 


tgaggccaat 


gctgtgaagc 


tccacagagc 


1140 


caccttctga 


tagcatccat 


tgcacacctg 


gggctctggc 


ttctcaccct 


ggcctcctgc 


1200 


ccttccaccc 


accatgggaa 


cacgtcagag 


agccaggctg 


gggaccgggg 


ctgcttcata 


1260 


aaggaacatg 


gatgccttca 


agttcacatc 


tgctgccctt 


tccctgaagc 


ctggcactgt 


1320 


cattttatgg 


ttttaaggca 


agacccgggc 


atggcaaggc 


caggatggcg 


tcctctctga 


1380 


tgcccctgtc 


acggggagct 


caagtgcagt 


tctggatgga 


ttgtgtggcc 


ctcctgcacc 


1440 


atcccccgtg 


gagtccatgt 


gctggtggga 


agctcatgct 


atgggtgagg 


gctagaagtg 


1500 


aagacaagac 


agactccatc 


ccttggaacc 


cgtacaacac 


agcgagaggc 


caggtcttgc


1560 


catcaccttc 


ctcccattca 


gtcccagctg 


cctcagcgat 


gcccaaggct 


ttggcacggc 


1620 


tctgctgatg 


ggtttcccag 


agttcactgg 


aggccagcta 


ccctgcttga 


gccaaagaag 


1680 


acgatgagtt 


ctagggagag 


gctcctgggc 


tcccagaggg 


gtcaagtgtg 


tgacagagag 


1740 


acgacagcag 


gtctgcacag 


tgtctgaggg 


caagttggaa 


gcaaggagca 


agatggaaga 


1800 


gaaaagaggc 


ttagagagtg 


aaggaagaga 


aggcagacgc 


ttttcacaag 


caacagggat 


1860 


gtaaagaagg 


agggaaatgg 


gaagggagaa 


tagaaatggc 


ttccctagtg 


tggagcctta 


1920 


ggtcagtgcc 


aagcagaggg 


gctgtcacct 


ctgtaccttc 


acgtcttcct 


cgggagcagg 


1980 


aggcgccagg 


aggactcatg 


ccaggcacat 


gccagctcca 


actgaggtgc 


ttggtagcaa 


2040 


ggtatgaggt 


aaggggttgt 


tagagtgcta 


tagcctgtga 


gatggtccta 


tctgtgtcaa 


2100 


ggcctgctgt 


ctctctccca 


gggtcatagg 


cagagagaag 


acggtctcat 


atgaagtctg 


2160 


tcagccttgg 


ggccttacct 


agccagttta 


aacccggaaa 


gtactgtggg 


ctgactgagg 


2220 


tttgccctcg 


gaggaggaat 


gaggaattaa 


ctgtgaggcc 


aagttctagg 


tccttccttc 


2280 


tcatctcagg 


catttagagc 


agggccagat 


gctttcctcc 


accccacctg 


cccagggagg 


2340 


acaggacagg 


gagagaccct 


agcagagcag 


aatcttcctt 


tagcccacct 


accgtgcgtg 


2400 


aatgtagcca 


gacagcagca 


aaggaaggct 


agcttcagac 


accaagccac 


cagacctggc 


2460 


tctccacaca 


tttttgccca 


gagacttcag 


cctgaacatc 


agtggcccag 


gaaacaactg 


2520 


catcagctcc 


catcaatcca 


tcaccactcc 


gtcatgggtc 


gggacagtta 


ctggttcata 


2580 
tgcaagtaaa gatgacaatt ctttcaacaa aaattagtga agcactctct gtgtatcagg 2640
cactgttcta ggtgcttagg atattgtctt gcttctgagg gactctggat cgggaggatt 2700
ccacctgttt ttgcctctct ctggcttctg gaaagtgctc cttcatgaca tcccctatct 2760
gcatagttca agcaccccaa atctgcccct tcctccttaa agactggtag atgggatcag 2820
ctcccaggtt accctctctc cccataccct cttccaaaag aaacaaaatg ttagggcttt 2880
ccccgtcgtt gtctctcctt tttttaaaaa gtggtatttt ttagaaatac atgtggaata 2940
ccaagaaatg tctttgcctc cccaccactc tcacctacat ttcataaagc tggctcttta 3000
tgttgcttt 3009

<210> 37 

<211> 1599 

<212> DNA 

<213> Mus musculus 

<400> 37

gtgaagcltt tgcatgcaga gttggggatg gggcacagag gggagagacg agaacagtca 60
gtcggtgggg tgcagtctgg aagagtgctg tctaggaaca cagaagtaat tagcaggaga 120
aacagctgca ggatttaaga ttggattttc cgagaggatg aaattggttt tgaagtaaag 180
gtggatccag cttttgtgtt tgacctttac cctgtggaat aacttgattt ttctaagctg 240
caagctgttc gaaacctctt ttagcccatc agtggtttgt ttcattttgt tttgagatgg 300
gctttcactg tggcgaccca tgctcttggc ctccgtggaa cagctccggc cttagcctct 360
caagtgctcg gattacaaca tgtaccacac caggcccatc tgccagactt ggagtaaatc 420
accaagtctt aggagccctg acacagatgc catctgccac aggcatcttc ccttctgcct 480
ttgtccttcc cggctgagct ccagattgta gaagacatct aaggttccag tatgactcca 540
tccatggcaa attcaatggc acagtcaagg ccgagaatgg gaagcttgtc atcaacagga 600
agcccatcac catcttccag gagcgagacc cctctaacat caaatgggac gatgctggta 660
ctgagtatgt catggagtct actggcatct tcaccaccat ggggaaggcc gggggcccac 720
ttgaagggtg gagccaaaag ggtcatcatc tccgcccctt ctgccgatgc ccccatgttt 780
gtgatgggtg tgaaccagga gaaatatgac atttcactca aggttgtcag cactgcatcc 840
tgcaccacca actgcttagc ccccctggcc aaggtcatcc atgacaactt tggcattgtg 900
gaagggctca tgaccatggt ccatgccatc actgccactc aggagaccgt gaatggcccc 960
t c tqqaaaac


t ant" at" rr;» 


*-ggc*-«xggg 


gctgcccaaa acatcatccc tgcatccact


1020 


crcrtcrc tac c a 




caaggtcatc


ccagagctga atgggaagct cactggcatc 


1080 


accttccatg 




^ :a s 4~ o 4~ fi 1 4- 

CddCatytCt 


actgtggatc tgacacgctg cctggagaaa 


1140 


cctaccaaat 




gaaggcagcg 


aagcaggcat ctgagggccc actgaaggac 


1200 


a t d c t* a crcr r» t* 


3 a a 4™ 4™ r*r ^ 4— <^r^ 
ddaLLy dLyd 


ccaggtcgcc


tcctgtaact tcaacagcaa ctcacactct 


1260 


IvW^CLwV^ L- I— y 


atgccggggc 


tggcattgct 


ctcaatgata actttgtaaa gctcatttcc 


1320 


tggtatgaca gtgaatatgg ctacagcaac agggtgatgg acctcatggc ctatgtggcc 1380
tccaaggagt aagaaaccct ggaccaccca ccctagcaag gacactgaga gcaagagaga 1440
ggccctcagt tgctgaggag tccatatccc aactaggggc acccaacact gagcatctcc 1500
ctcacagttt ccatcccaga cccccataat aacaggaggg gcctaggagc cctccctact 1560
ctcttgaatg ccatcaataa agttcactgc aacccaccc 1599



<210> 38 

<211> 2627 

<212> DNA 

<213> Mus musculus

<400> 38 



gagctgggga aaaccaagta ctattattct gttaaaggaa cataggcatt gtggtgtatt 60
caaaaataag ggccatgctc agctatcatc agagatgttc ctgcaacagt gcaagaaata 120
aatacagagt cccacagcca ggtattgtgt agagagtgat atgcatagag tttgcaagga 180
tctgcacctg aaaggggtcc cattgttgag agaagtagac acatgcctcc agccttaaac 240
aagaagctat caccaatgac cacttaaaaa tgaagatttt tttaccacct gtagtctcaa 300
taaggataaa aacccctctt aaagacagcc cctatgccta gtaatagaga tggccaacac 360
aaaatgaact caatgacatc tttggatatt atttgcatca ggtttttcca agtcaatttt 420
ttaacctttt aggcactttg aatatatatt atttccaagt catttttatg ttattcttgt 480
ggctacaaag gtatacaact ctgcatttat atgatttfcct ttggatcatt ttctatgttt 540
tttttcctgt ttccattggt ttgatatttt tctttatctt attttatttt atatatatat 600
tttaagattc ttgttggtct tcttacaaga aacaggcagg ggatgaatct tttttttttg 660
ttttttttgt tttttttttt ttgttttttg agatagggtt tctctgtata gccctggcta 720
ccctggaact cactttgtag accaggctgc cctggaactc agaaatctgc ctgcctctgc 780
ttcctgagtg ctgggattaa aggtgtgtgc catcacacac ggctgggatg aatcttaatg 840
gaagttgtat aaggggaaac catactcaaa attggtttta ttttaaaaat atctatttta 900
aatctaagaa aaataggaca gaatttctaa ttcttcgtga aattggcttg aacatatttc 960
tgctatttta aatatattct cagctactct gattacaaaa ttattatatc attcaaccgt 1020
gatgtttttg aaaagctttg ggtctaccgt atcaaatatt tagctaacat ttaatttcct 1080
tataaaataa gaacctctgt atctttagta ttcaaatttg tttttgtctt aaacataaaa 1140
tttattttaa aataacttta tacttactat cagtaatttt atttttatgc catttacata 1200
ttttaaaact tgctacagta taaaattttc ttgtgtatac aggattgtaa atagaactgc 1260
ttttaattta aaaaaaaaaa gaatgtcact taagattgaa tcctgatata caaaagaaaa 1320
tttgaacatc ttagaaatta gcattaatgt caatcagatt ggaaactcaa aatttaatgg 1380
cgaatagttc ctaaactgac tgaaaatacc ttcaaaccat atccagaaaa gtgtccttt. 1440
acagtaagca tgaaacgaag actagaaaaa ataaagtcct tacaatgtat taaaatcatg 1500
cagctaacaa tcttatttac tttaaagatc tatgaataac aacaacaaca accaaaatgt 1560
caccatgacc tgaaatgttc agtaatagca tgtatgccta tatggtaacc aacagcgttc 1620
ttactggact tattacccac actgtggaga tgagggagaa tcatgcctgg aatcagaaac 1680
ctgtgagagt gtccaacaat actgaaatca aggataagaa tgcactatgg te.ttt.tt. 1740
aaccaacaaa atctctaact aaattctaaa tgtttgtcat tatgctgtac atagataaga 1800
gaagtcctca ttactcatca agaaattttt ctgtgcagca aatgaagata tttaaacaac 1860
caattgaaat ttagagttgt gaggccaatt tcccaatgtg tatatatcca taactcctat 1920
acttaaggct taggaagcat tggcaaaagg acacagaaac ctggagtttg ctgtaagact 1980
atgatttcca gaaatgtgag aatttgcagc tatggagtct caccaaggct tccaaaatgt 2040
tgcctgaact gtgataaagg caatagacat actaacatca agcagcaaaa tctcatgacc 2100
cctcaaatct agacaaagaa atacagatac ctaaagaata ccgagaacag gagaaatagt 2160
ctacctcaaa gaaaagcaga ttaattggtg atccaataac gacagttcag gtctgaaaac 2220
atacaagaaa cagtactcag aaccaggctc tatttatgta attaggagtg ttcacatgca 2280
tgcatgtgca tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgta gtgtattttt 2340
tgtatatgta tgtatatgca tatttgtaga tgcatgtgtt tgtatgtgtg tatgtgctag 2400
atacagatac ctatgtcaca gcaattaaag aaagtcatga atttgaaaag ggggggagat 2460
ataaatgaaa gggaatgaga aagtgatgta attatattat aattttgaac tgaataatct 2520
tctaatttca aacccgtcca tatatatgta agttcaaatt ctagcttttt caatttaaat 2580
ctacctcaat gccagctgtc tctgtctttg tctttgtttc cttatcc 2627



<210> 39 
<211> 1854 
<212> DNA 

<213> Mus musculus 
<400> 39 



tcaaaactaa ctccttgagc tgaccaggaa agatgccctt gagcagcatt gcagttttta 60
cagcacagac aacccagaaa cgtacatact gcctaaagtg atggtctgcc cagcatctct 120
ctctccagcc agctctcaga gctgcccaga gcttcaacac cttgaaccct atgtgcttag 180
gagtgacagt agtcaggata gcccagcagg cagctgggta tactttattc agggcacttt 240
ctgtacatgt ggggttagtc tttatatgta ggagattatt aaatgccctt ttgctgcttc 300
ataataattg cagatacctc atctttgttt cctactactc ttctttgtag aggggcctga 360
tggcctctgg ccctcctcta tcataattcc atgattcctt cccctgcttc tgtctttcct 420
ttctgcggca gcttctctca accctccctc ttcattctac acattgtagt gggggcaaat 480
ttttacttga gagatacgta catagtatgc agtttatcct tgtacagtgt acatttgatt 540
gtcgtctatt tacagagctg tggaaccatt atcatcatca gttttagaac atatttatca 600
ccagaaaaat cctgttccca ctagcagtcc tccttcacct ctcctcagac ctggcaacta 660
ctaatctttc tgtccttgtg gatttgccta taacggacct ttcataaatg ttatatgcga 720
tgtgcaatgt gtctgtttac ttagttttaa ggttcatctg tgccatggta tatatcatca 780
tttcaagtag tgagtacctc tctctctctc tctctctttc tctctctctc tgtgtgtgtg 840
tgtgtgtgtg tgtgtgtgtg tgcgcgtgtg cgcttagaca ttttcattgt gtggccctgg 900
ctgcccttag aacaggctgg tcttgagttc acagaggaag ctctgcttcc caggtgctgg 960
gattaaaggt gtgtggcact gtgcctgcca gtgggtttgt tttgttttgt cttttctctt 1020
cagtttttta aattgtctaa atatttcatt acagtgcata taatgggcgt tctgtgattt 1080
ctagtcagga cgatcatgaa taggcttctg tatatatgtg tgtgcaacac cccttcatgt 1140
tctccttcat ggacatgtgg gggttttggt ggtggtggag gtgggttttt gtttgtttgt 1200
ttgtttttgt ttttgagaca gggtttccct gtgtaacaag ctctgtctgt cctggactca 1260
atttgtagac cagggcctgg gcctcgaact cacagagacc ctcctgcctc tgctagccaa 1320
gtgctgggat taaaggggtg caccaccaca cccagcgtgt tttcattctt ttgtatagga 1380
atagaatgga tggatcgtaa ggcagctctt acctttagtc tttggaggaa ttgccaagct 1440
gtgttcccat ggaggatatc aggattccag tttatcaagt cttcaccaat attgtttact 1500
aatttgaagt gtgtttcatt gtgtatgtgc tgtgttttat tttgtgggca gtgcagggga 1560
ccagacccag gacttgcaga tgtagacaag tattgtacca ctgggctcca tccacagtcc 1620
cttagctggt tctgagttgc agtttcctaa taactagtga tgttggccat tatgtcttcc 1680
tcttcctcct gtgtctgtgt ctgtggtttt aagatagggt ttcatgtgtt ggacaatatt 1740
tttgttttgt cttttttttt ttttcaaaga tagaatttct caggctgggg aaataggctt 1800
acaaccaaaa atataagagt tcggttccta gcacccatag ctgtctctcc agcc 1854



<210> 40 

<211> 3683 

<212> DNA 

<213> Mus musculus 

<400> 40 

acgattataa actgaagaca cttaattctg gtagatacgt agactcagct gataatacca 60
atcactactg atcaaactca ggatccatgt gtgaactgta ttcatttctc tgtggtgctg 120
ggagtggaac ccgggatctt gtattctagc aagtgctata cctttgagcc acacttgagc 180
cctcccacac gtacagtctt aagcatgctg ctgcccacac tgacacacag agatgcacac 240
tgacaggcag agtagcccca gcaacccaca tacagatgtg tgtatctaca cagatgactg 300
agtctcacca atgcacattc ccaagtaact agaacctata agcttctgtg atgcttggtt 360
cttaaactcc cccactctcc ccggcctcag tgcagacaat ggggccagct actaggcagg 420
aaactggagt ggcaaggtag gccatggtta ttaaagaaac cagacggggt ggtggtgcac 480
acttcaaagc ctagctctag gaagacagag gcaggtggat ctctaggcta gcctggtctg 540
cagaatgagt tccagaacag ttagggggac ctaaacaaac cctgtatcta aaaacaagag 600
ctatatccag ttgctgctgg cattgggcag gtgccagctc cacatacacc tccatggcag 660
tgggcacacc ttcctcttcc tctagggaca gcgactcaga cttgtacagt taccagcccc 720
agagctgtct ctctgcttca taagatgtca ctcctcattt ttaggggtga gaaaatgggg 780
aagaatatat aggctgtaga caggataagg agttgagcca aggccaggtt ccctccactt 840
tctctggtgt gctgtttgtg cacacgtgtg agtatctaga cttgcaaaga tactgcttct 900
caacaggagt ctgggatgta agagacaggg actgggggct ggagaagtgg ctcagggctt 960
aagagcactt attgctcttt ccaaggatct gggctctatt cacagtacct ccagggtagc 1020
ttaccaccat cttgtaactc caattccatg gggatctgac gccctcttct gatctgtggg 1080
caccaggcgt gcaaaacatt taaaatataa ttaagtcttt tttaaaagta actttaaaag 1140
tcgtttattt ttatttcatg cgtattattg gtgctttgcc tgcgtgtatg tctgggtgag 1200
gatgtccgat ttcctggaac aggagttaca gacagctgtt agctgccatg tggttgctgg 1260
gagttgaacc agagtcctgc aagagcagcc agtgctctta accacagagc catctctcca 1320
ttccaagtca ttgcattttt aaagagttca aagaaagaag gtgggatggt aggtaggtac 1380
tgagcctctg cagtggggta aagtcagatc aggactggta tttgtggagc atgcccacct 1440
ggctcagttg ccctctgcct gactcctccg tccatgctcc atctatgctt atctttggtg 1500
tgctgtgttt gcttctacag atggggaggt tgcctgactg tctacccatc ggcctggacc 1560
ccagagccaa gccagctgag tgagtgccga gggtcccagg tgaccggggg agtggcactg 1620
ctcaggcctc tgcaaagact accctggaga agctgatgct ggtggagaca gccctttctt 1680
gctgagtcca gcctctgcag agcctgcgca ggtaactgtg agccagtgta acaaagatcg 1740
cctcttagct taggcagaga gagagacatt aatctagcct attcgaactc cccattatcc 1800
agagaaggga atagaggctc atgtggcaca gtgagtccct tagcatcctg cctcctagcc 1860
tgaggactct tccctatgcc tccatgacct ggagaactgg gcggagagat gggttgtgac 1920
ggtgaccagg attcagggca gaggtggaag ccagcgtgcc tgatatatgc agctgacctc 1980
cccaggactc ctctacagag ctgagaggcc atggtagtcc aggtgtgatt gcatgcccgg 2040
cagctggccg cagggcaggc catggcagga cccatctttc ttgtggccca ggtttagccc 2100
acccaggctc ccagccacag aggccaggtg gggctgccct gccctataga ggcaactccc 2160
tgaactgaca catgaagcct caggctccag gaagctcctt agagtttagc ctgctggaaa 2220
ccccacccaa cttccaaagg agcattcctg aggtctgcat gggggttggg cctccaggga 2280
tgggaagggt ggggaggcag ggctgcagga ggatgtgagg ctgtaaatgt gctgtgctgg 2340
gtctgtgtgc aagcaggtgt gtggggctct ctgtccttag tgcacagggt tgtgtatgca 2400
agagttacat aagtggccta tgtgggggga atagttaaga gcttcacatg tgtgctgtag 2460
tggtgactgc tggctatgga tcctatgcag gtgtgggtat cctgggctgt ctaccacaca 2520
gggtactagt ggaccaagac tcaaacaccc ttgggtgggt ttgtgtgtgc tcagggttct 2580
gagcacaagg tgaggtggca tgtgcacatg gctccctttg tgttttggag gtgggctggg 2640
caggacttgg gctgcgctgg gctcagcaat gcccagcctt gtggccaaac tgctgaagtc 2700
acttcctttt caacagtgac tttccagggg gaggagttct tccgacctgg tcggcagtga 2760
ggggcgtggc cgaactggct ctctctgggg agggatgtgg gcgggggcgg gagagaggtg 2820
ggaggaggtg ctagtggtac tggaagaggg agtcctctgg gagacagaag gaaagacaag 2880
gacaggagtc tggaagggcc aagggcagga gggaatgggg gtggagtggg gctgaaagca 2940
cagtccctgg gtgacctcgg agggaggaag ggagggctgc caatgaggtg accctcgggt 3000
tcagtgagag ctggcagtgg cgcctacaca cctggcactc ggggcaaggg gctggcaggg 3060
ggcggttcag gaacagacct gcttgccagg tgccccactc tggacaggaa gagggtgggc 3120
gggggctgta caaaggagct ctgtgtggct gaggataggg tagggtgggg tatgcagtgc 3180
tgtactgttc tggggttggg ggagatgatg ggggcggggc aggaccagtt ccccttgggg 3240
catcagtggc tccaggggga cacctagtgg tcaggggagg tagtgcatct tgataacaaa 3300
ctgggggaaa agagattaga agtggtagtt gagatagttg aggaggccag ggctagtctg 3360
aatctttgga tgatgaagca atttgactta aaggatccca acaaaaccaa acttaggtga 3420
caacaaagct gattggcatg gctgtgtgtc cttaagggca tgactaagcc tctctgtgtt 3480
cacatttaaa tgcaaaacaa gtgactgggg ctggtgagat ggctcagcag gtaagagcac 3540
ccgactgctc ttccaaaggt ctagagttca aatcccagca accacatggt ggctcacaac 3600
catccctaac gagatctgac tccctcttct ggtgtgtctg aagacagcta cagtgtactt 3660
acatgtaata aataaataat ctc 3683



<210> 41 

<211> 2311 

<212> DNA 

<213> Mus musculus 
<400> 41



aaatgtgcgt 


tattggggtg 


taattgcgat 


tQctaacacrt 


1 \~ aa n\~ a net 


«. y y «g c ci c 


bu 


agttgtatat 


atccaaggct 


gatacagaat 


ttQcrtaacrac 


ttattatcrfcfc 




*t on 


gggagacaag 


ttgctggaca 


tcaagctcag 


gggttatgcc 


tgctatgaaa 


Qactacraar 


1 fin 

J.OU 


aattgagccg 


catcccagtc 


tgtctgtgtg 


tcttgtcttg 


tctttttctt 


cct*t~r , r , t~t~r , r i 

I— WV« L> w V-# L 




ttccttcctt 


ccttccttcc 


ttccttactt 


ccttccttcc 


ttccttcctt 


ccttccfhcr 

L LUUL 1— V- 


inn 


tctctctctc 


tctctctctc 


tctctctctc 


tctctctctc 


tctctttttfc 






gttttttgag 


acagggtttt 


tctgtgtagc 


cctggctgtc 


ctaaaachca 


v- L>y Lay ctct 


ao n 


caggctgacc 


tcgaactcag 


aaatctgcct 


gcctctgcct 


ccttagtgct


crcrcra fhaasn 
yyyau Laaay 


/on 
*l o u 


ggcgccacca 


ccgcccggct 


atccctgtct 


ttcttgattg 


caaaatttaa 


cLdv^ciL^yyy^. u 


An 


taatgaagaa 


gttgcttagt 


gagataaggt 


acaagttata 


tttfctaaaat 


*- ctctct LLLctLct 


enn 
OUU 


aaactacata 


atacttttga 


aatcatttaa 


tgttcaaata 


tttttatfccc 




££n 


tttaaagaag 


tatcacacag 


aaaagcttaa 


aattatacat 


cracrattcrcfca 


taata era cr t - a 




tatgagctgt 


agcaatttca 


ataaaacaga 


tgcttgaaaa 


attatgtaat 


gactgacctt 


780 


actaattgtc 


ataaacttca 


catgtcactt 


QtaatacrcjQC 


aaaaactgcc


tctattaaat 


840 


gttttataga 


aacctgcttt 


attgcatttt 


ag 1 1 a cagga 


ataattcttt 


tort taaaata 


900 


caacctccac 


tttgtcctct 


ggaagagtaq 

J3 ZJ ZJ Z3 


agaccttgtt 


aqatcrt t cacf 


aataaaaaaa 


960 


aagatacgag atatataaac tgtattgtga aagctctgtt ttagattgaa agctgcctgg 1020
aaataggtac tacagttttg ttttttcaac ttataagctt attaaaaatt cacgaaaagg 1080
agtgtacaag acacacatct atctaggatc cttttttttc ttcctgggac acattcaaga 1140
gagagcagtg ggttctgaag tgccatttgg ttttgcattg cctttatttt tggtctagcc 1200
gcactgttgc ccagacttgt cttgaactca tqqactqaacr ccatcactct gcctcagcct 1260
cttaggtgca tacatggtca cctggcgctt cttaccgtct ctgctacaag agtacttgca 1320
cacacaattc acattgttac agtaaggcat taatgcagtc agacctgttg ctaagggtca 1380
gggttccaga atcttacttg tgctaaatta atgtgtgttg gttggttagc tggctggctg 1440
gctgttgttg aatttgtatc aaccttagtg ctgaaattac aaatgtgtca tcacatcgac 1500
ttaacctaat ataattgttt ttgcccagtt agactattca ttttccaaag tccacttaag 1560
gactttgttg tgtatggttc taggtagctt gccacagtat ccctccctcc agattctagt 1620
catgagggag cttcctgctg gttgttgctg cactgcatat cagtttcttt tgacagaaga 1680
ggtctacagc tttttacaat actttgataa tttgaactat atttgcatcc ctacatttta 1740
caggtgtgtt tgctacagac cctttgtcag tttcagcatc aattgttttt ctaaaaagtg 1800
tagtcttagt ttaatgaact tgaatttgta agatttatat aatgtatata tatcctttga 1860
atctttgatt tactgtcttc tattgattat atctattctt attagtccct ctggaccctt 1920
ttcttatcat ggccacctcc caactttatc tcctcctcaa agtagttttt ataagaattt 1980
gtgtatgaaa aagaggcaga gaagatataa atatgccatg tgtacccaca gaggcattgg 2040
atcttcctac ttacgggcat tgtgagccac ctgatgaggc tactgggaat agaacttgtg 2100
ttctctggaa gagtggcaaa tgccatggaa tcttctccag ccccctttcc tttttttttt 2160
tttcttttct cctcttgcct atggctttgt ttttgttgtg ggggagtggg gatctttatc 2220
tttacttttg taaagtgttt ataaaaaccc atcagcattt ttcctactct tgcttccctc 2280
tttaaaactt aataaatttt ttgagaattc c 2311



<210> 42 

<211> 2421 

<212> DNA 

<213> Mus musculus 

<400> 42 

tctctctctc tctctctctc tctctctctc tcttttaata ttaacaggag accaccatgg 60
acagcatcag ctatccatgt tatttacctc agtttagcca gttatttacc tcaggacaga 120
atggaagtgg gagataaaga cagaggtgta gggaggagag ggagaggagc acctccgtga 180
cctcttcacc cagttacatc aagttccaga accttccagg acagtgccag cagctgcatg 240
ccaagtgttt aatacataag ccttttggga acatacttcg tatctaaatc acagtgagct 300
atgtgctggc atacagtgct cataccctca tctgagccta ctcccccgta accctgcgaa 360
tctaaggtct tcctgttagg ccagtgaggc agctcagtaa ttaatggtgc ctgatgccaa 420
ggctgacacc ttgagtttga tccccagagt acacatggtg gagggagaga cctgactttc 480
aaggtggttc tctaacttct gcatgcacat cccctcctgt cctgcccgac aagtaaataa 540
agtgtgacgt aactcattag taagaaaatc aagtcccact cctcaaaatc tttttttttc 600
tagagatttt atttatttat ttatttactt ttaaagattt atttaatata tatgaataca 660
ctgtagctgt cttcagacac cccagaagag ggtatccgat ttcattacag atggttgtga 720
gtcaccatgt ggttgctggg aattgaactt aggacctctg gaagagcagt cagtgctctt 780
agctgctagg ccatctctcc aattctttga aagtggattc ccccccatga gtggttagca 840
ccatttttct gtatgcagaa tcgtggcacg gggacttaag ctgcttctaa actacaagtc 900
gtcttagctt ggccgttatc ccaatcggta cggccacacc cagggattcc atctgtcacc 960
aagctcctga gtatcttggt gtacttgatg cttgttggaa ttgggttttc tctggtcagc 1020
tgaatcattg gccacccatc tactaacttc acaatttgca aatctaggtt actgttttca 1080
tcatcatttt aatttttctt tatgaatcta taccattaca caaaatgtat tattttcatc 1140
ttgtggggga attcaggaga ggatggaatt aaaatgtgtg tccagctaat actgatttac 1200
ttggacatct acattttatt gacagaaaaa acatgctgtc aaattgtttt attaaggcag 1260
ttccctccat cctggactga ctgacttaga aaacctccat caataaaaga cattgtcttt 1320
gtcataatgc ttgcagtttg aacagacagc attgaataat tatggaaata aactttgatg 1380
ggctctgaga aggaaaaaaa gtctgatggg aaacatgaat atttacagta tagattctac 1440
tagaatcttc caaagggcca tcctcactat tggagaaact ttttagtgta acgaccacag 1500
cactcaggat accagctgtg aacagtggtg ccattaacca ccaagaggga gcacagtcta 1560
gtgacaaaaa gatagaaata aagggagttg gagtcacatt tcagagttct ttggcaagat 1620
aaagctgttg gcactaaatc aatacactca tcatgactgt gtcatcaact ggacaacgtg 1680
ctaggagacg gttaattgct ttattctttt cttttttgaa tgtgtttgcg tgtgtgtatg 1740
tgtgtgtgca tgtgtgtgaa tgtgtgtatg tgtgtgtgtg tgtgcatttt gtgtgtgaat 1800
tgtgggccac agaggagcag tggaggtcag atgacaacct ccggttggtc ctcaccttct 1860
actgtatttg aaacaggacg tcttgtttgg tggtaccact gtatatacac atcacactaa 1920
ctgctcatga ccttctggag catcctacct tggcttcctg tctcacctcg gctgcactgt 1980
cgtcacagaa tatacactac cgtgtcccca gctttccatg gcttcagctc ttaatgacac 2040
gtgtttttgc tcaccgaacc ctctccccag cgtttagcgc ttattcttgt atgaacaaac 2100
ttgtatctaa cacctactgc atgcacagac tacagatgct agctgttaag agttatgcaa 2160
tgacatccct catagaaacc atgcatgtcc ttgtttggtt tttctgccct taacctcttg 2220
aagttgtgga tttgaaacca cagatgagat ggatgagctg gtgagatggc tcagcaggca 2280
catgtgcttg tcgtcaagcc tgagatagag tccctaaaat ctgtgtggag gaagaaggcg 2340
tggaacttct aacagctgtc ttctgacac* cacatgtgta ctgtagaatg ccctgcccac 2400
ccccataaac aaataaatgt t 2421



<210> 43 

<211> 2545 

<212> DNA 

<213> Mus musculus 

<400> 43



aagcagtctc tacatggcct ttcctacagt ctctgatcta ctctttccct ctgtatttcc 60
tttagacagg ggccaagcct aacctctgta tatggtctct acaagttctc tctccccttt 120
gtatttcaaa taatgtcatc actgttgggt tctgggaacc tcttgctttc ctggcatctg 180
ggactttatg gttgctatcc ccatttcacc atcccacact gctatacact tctgtttaaa 240
tttgtgaccc tctgtacatc tttactgttt cctcccacac ctgatcttgc ccccttttgc 300
cccgacctct ctctctcctc actccctccc aagtctatct ccctcattcc atcttccagg 360
atggttttgt tcccccttct aagtaggact gaagcatcca cactttgatc ttcctttttc 420
ttgagcttca tttggtctat gaattgtacc acgggtattc tgagcttttt ttgtaatatc 480
cacttattag tgaatacata ccatgtgtgt tcttttgtga ctgggttagc tcacacagga 540
tgatattttg tagttccatc catttgcaga atttcatgaa gtcattgttt ttaatagctg 600
agaactaatc cattgtgtaa atgtaccaag ttttctgcat ctattcctgt tgaaggacat 660
ctaggttgtt tccagattct ggctattata aataaggctg ctatgaatat agtagagcat 720
gtgtccttat tacatgttgg agcatctttt gaatatatgc cccagagtgg tatagctggg 780
tcctcagata gcagtactat gtccaatttt ctgaggaact gacaaactga tttccagagt 840
ggttgtgcaa gcttgcaatc ccaacaacaa tgaaggaatg aatgttcctt tttttccaca 900
acctcactag catctgctgt cacttgcgtt tttgatctta gctattctga ctagtgtaag 960
gtggaatctc agggttgttt tgatttgcat tttcctgatg actaaggatg ttgaacattt 1020
ctttaggtga ttctatgcca ttcaagattc ctcagttgag aaatctttgt ttagctctgt 1080
acccattttt taatagggct atttggttct ccggagtcta acttactgag ttctttgtat 1140
atattggata ttagccttct atcagttgta gggttggtaa agatcttttc ccaatttgta 1200
ggttgccatt ttatcatatt ggcagtgtcg tttgccttac agaagctttg caatcttaag 1260
agatcccatt tgtctacagt tgatcttaga ccttaggaca ctagtgttct gctcaggaaa 1320
atttcctatg cccatgtgtt cgaggctctt tctcactttt tctcctgtta gattttgtgt 1380
ttctggcttt atatggaggt tcttggtcca cctggacttg aactttgtac aaggagataa 1440
gaatggatca atttgcatta ttctacatgc agactgctac ttgagccagc accatttgtt 1500
gaatatgctg tttttttttt ttccccccac tggatgattt taacttcttt gtcaaagatt 1560
agatgagcat aagtgtgtgg gtttaattct gaatcttcaa ttctatttca tcgaactacc 1620
tgcctttctc tgtgccaaga tgatgaggtt tttatgtata ttgctttgta atagagtagg 1680
gatggtgatt tcccccataa gatctctgtt gagaatagtt ttggatatcc tgggtttttt 1740
ttgttgttgt tgttttttca aatgaatttg agaattgctc tttttatctc tgtgaagatt 1800
tgagttggaa tttgatgggg attgcattga atctgtagat tgcttttggt aagatggcca 1860
tttttactat gttaatactg acgatccatg agcatgggag atctttccat cttttgaggt 1920
cttctttgat gtcttttttg ggagactaga agttcttatc atacagatct ttcacttgct 1980
ttgttagaat cacaccaagg tattttatat tacttgtgac ttttgtgaag ggtattcgca 2040
attcctttct cagcccattt attttttttg agtagaggaa aactactgat ttgtttgagt 2100
taattttaca tccagccact ttgctgaagt tgtttaacag ctgtaggagt tctctggtgg 2160
aaattttagg gtcacttata tatactatca tatcatctgc aaatagtgat atttcgactt 2220
cctctttttc aatttgtatc ctatttgcat agaaggctaa tatccaattg aagatctcct 2280
tttgttgtct aattgctcta gctaaatctt caagtactat attgaataga tagaaagagg 2340
acaaaaagaa gcaaacacac ctaagaggag tagacagcag gaaataatca aatttgggcc 2400
tgaaataaac caatttgaaa caaagagaac tatacacata attaagaaaa ccaggagccg 2460
gttctttgag aaaatcaaca agatagataa atccttagcc agacttacca gagggcatag 2520
agagagaatc taaagtaaca aaacc 2545



<210> 44 

<211> 2435 

<212> DNA 

<213> Mus musculus 

<400> 44 

tctctctgtc ctgcttgtca ggaatcagca tgatcatgag ctgtggttag acatgatggt 60
ttacaaaatt ctattgaacc tggcatgaaa aaaaaaagtc tcgggaaata ttctttttaa 120
gttttcagtt atttcagaac cttactaaaa tatactgcta cttgctttta aagtcggttt 180
tgcgcacata cgtcttttgt tgctttcagg ttctcccatc ctctgtttgt agaaatggga 240
cagacaccat ctacaatgta agcacaaaga attgccaaat gtcttaagtt catagatgtc 300
ttcatagaag tgatactatt atatttttct ctttctctat tttcccgaag tttttttttt 360
ctgaaaaagt tttttttttt aaagaagatt gggtgattga gaaaaatcct ttttgtctct 420
ttgccttgat gctgtctttt aaatgcttat tatcattatt aaaaggtttg tgtactttct 480
cagtgtttat ttttctctca ttgtttttgt ttgctgtttt cctcaagtac catggtatag 540
tttccttgaa actagagggc gacagattct cagctcacaa acggaaggcc aaaatatcca 600
ttgttagtca gccacagagg acaatcaagg tggcagagct gcctctagct gataaggtgg 660
aatccacaac tgatttgcac tttctcagac aggtatggaa atcttccttc cttcctgttt 720
gccttccagc aagctgttcc ctgggccaca aacacacttc attcacaaaa tggaccccaa 780
gttgggggac agtttacttg acacagatgt tgtactgtag atgaagaccc aaggaaggaa 840
actgctatca gcttgcgtga aatcaattct aatctgccac tcttgcctac agggtctgcg 900
gcccttgttt ccaaagaatt gttctgtgga cttaaaggga ctctttcatt ttgaagaaag 960
cacccacaga ttgtatcaat gtgatgggat ctcctggaaa gcctggagcc cccaaaccaa 1020
ggtgggacat tggctatagc taagcacctt taatgagcaa tatgccattt aatgagctta 1080
cagcgtgatt caatgtgtga ttctcaaatc gcacatgtct tttcttatta tacctagaaa 1140
gccatgaggg aaacatggta cgaagtgact ttttaagatc caaagtataa agccaggtgg 1200
tggtggtacc tgctttgaat tcctgcagag acaggtagat ctctgagttt gaagtcagcc 1260
tgatctatag agtgagtttc aggacagcct gaactacaca gagtctgggg gaataaacaa 1320
cagcaacaac aagatccaca atatagtatt agaccagtcg attttgttta gttatcagaa 1380
tgtcagtggt ataacattga accatttctg actgaggtag tcctttcagt aagaagtcat 1440
ttatttctta agatgagaaa ttacagcgag ggcttcctct gcttcgctga agtgagaagg 1500
ctcacaggct tgtactatga ggccgacttc agatgatcaa gtccattccg gggtgaacac 1560
gcacaactgt cttgcagggg ctggaggaca gatcctgccc aggtggatgg ctccttcatt 1620
ctggctactg tcacatcttg gtcacaagac agaaaggcac ctggaccaca gctacccgag 1680
cttgcagaga acagtaagga acccgacatc acacgtgcct gccgttttcc ttattttcac 1740
atggaaatgt aggcctgcat aagtcattga ataagtattg aacgtctact gtctgtccag 1800
ctttgtctaa ggagcctaaa gggagtttgt agtttggaat ggatttgtgg ctgactttga 1860
tacttctgtc tgctaaggtc tttgctgtta tggctatact ggtgtcacac tcttttctgt 1920
ttccttttat ttccacgttt taaagaaaaa gtctaacaac tttcctgagt tagcagcatg 1980
gtggcttctg atattttctc atctctcttt gttccccctt caaatgacag acatccatgc 2040
ctattgattt taaagtacct ggtaataaaa ggccttggca tccagcctct ggcagtgcag 2100
gaggcacgga tgcccataaa gccaggcgag taggacaaag gctcactggg taacagtcaa 2160
taaaagatga cttgtggtca gataatgaca acaaaatcca tctgatccag aaggagtttg 2220
aaaatgcata tgtagcagtc cttgaccaac ctttttattt gataacctgg atattatcca 2280
aacaatgcaa ttacactaat atgcatgcat tcttgttgaa attgagactt cagtaatttc 2340
tagttaatat tgatcatatc ctatacttta tcaaatttaa aagggttgtg ctcatcaaag 2400
gacaccatta agaaaatgaa aggtaaacca tatcc 2435


<210> 45 

<211> 1718 

<212> DNA 

<213> Mus musculus 






<400> 45 

aagtgaccca cattactaaa gaatgtcttt caagagagga aaactaaatg gccagtaaga 60
aaggatcttc acaaagactt tccgaggcat tgaggtccaa ttcttcaagg tcttcagagg 120
ctgccaaact caggctccca ccctctcctt ccatgggagt ctgtctaggc tgaaatctaa 180
gtccaaatga atgaatctgc ttggggggtg ggagcaagcc acagctgtga ctctagctca 240
ctgggggttt ctgttttata ggcggtcagg atcgagtctc tgaaattatc aacttgggac 300
taggaaaaca attgatacta tctgatttgt agaatagccc tgggtaggga aatactaatt 360
ttaaatatca ttcatctttc cttttatcca gtctctgcgc tgtaatggaa ttaggtgaaa 420
ggactgagtc caagctgtag tgagctcagg ggagcttgca caccgaactg acaaattaat 480
ctgtggacag ccctcatcca gtacttattt agtcaatagt cagtgaatga gaaaagggga 540
aagcacattt ttcctttttc tttctctatt ccttccttcc ttcttccttc cttccttcct 600
tacttccttc cttccttact tccttccttt cttccttctt tccttccttc cttcctccct 660
ccctccctcc ctttttctct ctctctctct ctctctctct ctctctctct ctttctttct 720
ttctttcttt ctttctttct ttctttcttt ctttcttatt tactcattag aattttttcc 780
aatcagcata ctgtgaacct tctgtatctt ttccaaaacc atcaattttc agcttaattg 840
tttttccccc cttctgacta aaatgtgaac tcttgaagta aacgtattta catgttaaac 900
ttcccagact gagttaagag aattagaaaa ggactcagga gataggaaat atgttcagct 960
ctgtggaaga gctcattatg gaacgtttcc caaacagctc tgcaatgaat gcattgtgag 1020
accacctata atcgatagac acatgtgtaa actgttaaag gaaatactta gcaaggtttc 1080
agaatttgag gctgatgggg gg^acttctg ctatttgaat ttataaggag caccaggttg 1140
cagggcacat cacactagca ctcctgtgat cctagcacat gttattatat aagtctgttt 1200
gtgttagtgg gggtacatat gtgtgcatgt gtgcacatgc agagggcaaa tgtgaacctg 1260
tgtgtaattc cttagggtac tgcctacctc ggtttttttt aggcacaggc tctcattggc 1320
tcaggcctcc cttatttggt tatgtttgct agccagtgag cccgagggac ccacctgtct 1380
ccccagctct gggattcctt ctacctgcca acaggttggg gatttttttg tgtgagttgt 1440
gaagatccaa ctcagggctt cgtgcttgct aacatagcaa gcatttatta gccatgttat 1500
catctctgct tcctggggtg tgtccattca ttaacttatt caataagcat tgtgactgaa 1560
tggtg.aatg ttctactgag aagaacttag aaccaatttt ctgtgctcga atcctgactc 1620
taaaagatat catctgcttg aacttggcca aatcacttga cttctctgaa ccccagtttc 1680
ttttttgtag aaggggtaat aaataaataa ggggtagg 1718



<210> 46 

<211> 3044 

<212> DNA 

<213> Mus musculus 

<400> 46 

tctttttatt ttatttattt attttttgac aatttctctt tgcttttatg ggggagagaa 60
agtttttcct tgcttaggag gaaaatgaca ggtctttact tggttgtttg tgctgagacc 120
acactctaag atatatttga aaagtacctc agctgaagca gcctctgcta gttcacagaa 180
gaacacaaag aagctacccg cacaaagctg gataagtcag taagagaaga ttccatggga 240
tctaagaaca gaaatctgtg ttttagtgtc atcttattat ccctacctta tttcttggcc 300
tctctgtgct cttgacttta attcagggta agtagccctt gatccccttc ccctcttcag 360
acacgggctt ccacattggc tacccgacgc ctgcttctgg agactcctct gttgagtacc 420
tgcctttgca gtggttgccc caaagctgta gtgagacagt gacatggttg cctgagactg 480
tgctgtaagc agaacatgtg cggatgaagg tttccttact ccagactgga ggagatgggt 540
caaccatagt tcttcatata tggcggatac tgtttatatt gcagacagca gtgacagaaa 600
aattatgatc gatgcttatt ttatttcagt ggaagacaca attccatgaa tcagaagaca 660
acaaatgcca agaaaggacc aggttaaaaa ggttccaaga tcaaaccatc ttccaaaact 720
caacccgtga ctccagagct gggcagactt ctattttttc catctttttc gtcattcagt 780
gtcttcagtt taggaagcta atgttcaata cattcctcag gggaggaaat agatgaggga 840
gagcggcttc tctcactctt tcaaaagttg attcttctgc cctgcataaa agtcatgggt 900
ctccagcaac cctaaatatc cagtacatgt gcttatagtt cttacaatac cttagaaggg 960
acccgaaggt tgacatcctt ctggcactct cagtagagtg tttctgatat tctctgattg 1020
cacttttatg ttgtatttgg tttctaaatt gaccattagt tagaactata tattgtttta 1080
tatattatat ataatatctc cacctttttg ttattttaaa ttgcatcatg ctcagttcaa 1140
ctttctaaaa tcaaggtttc aacatattcc cagatcacat gcatatgaaa gtttcaggaa 1200
aatattatca ataacttcag ccatgcactg cattgttctt gcccccatcc cccccccccc 1260
gccagtaacc ttgaggattt agggttatcc tatcacctca tgtttatttc ctacatctag 1320
aaaaatggct gtgctaaact gtataattta taaaatttct agaaattcag tcacaggggt 1380
ctgccactgt gtatgtggca ttaaagtcat tttgtgacat ttactaattc acaaaaatgg 1440
tctagttttt ggaccaaact ttgcaggagt ttggaatact ggagactaga aggaaataga 1500
gggagaataa gaaattgttt ttaggaccta tgaactttat tgaattttat tcatgtttag 1560
ggtgttttgt gactcgaatg ttttttctaa ttgtagtaga gatcaggaag attcatataa 1620
cttacccatt taaatgagag aaattattgg aatttaaatt tcttgtgcat gttaagcaag 1680
tggtgtttta acctatatct gatgaaaaaa gtcatttggt tccttttata tgtaatgtgc 1740
cttttacacc tgaggggttt atgtgtatgt atgtatgtgt gtatgtatgt atgtatgtat 1800
gtatgtatgt attatgcata atatgtacat gtatgtgtag ataatagaac cataatgctg 1860
cactcatgaa cttgcagtag ctatggttgc ctgcacaaaa cctgctcaag atcaagctag 1920
cc.acttcc a g cteca 9 t g g c. g a ggg .= ac.t ggg =ct cacc=ct. g c tg , g a. g =t g
g t g ac. g caa gg tc tg c t . g g a g a ggg . 9 a g tc= gt cctt tu^M* g9 t==ct gg t 
a 99 «ca g aa t 9 atcca g ta g at gg cccca c gg cta t9 ta C «.t g9 aca g cact.a=t 9 
9 c«c» Cgg MM WW» aacatatat. Cgg at 9 t.ta WW 
tatacacaca t.ttt... g a tat g = Cg t<= g g9 a. 9g9g tt at g9g atc« apattggtg 
t g t g t.t 9 .a at.aat.eat t gt at.c. Ug t gtg a,atta tc g .. g a.t. aaat.tttt. 
at.aaa t99 a ,t.« 9 ac,t g ata ggg c« t a,t g t,cat attttctctt trttuttt. 
tct t c=aa g t « 9 t«cta, , 9 »« g .t« «ct==t gg c ttt gggt ctc cca«« g ct 
«a t atat=t ata=a g «« ta« gg « gt ^tatcaa taatateata « g c. g .« g 
caatattcc. a tg a g ca«t aaa« 9 c«» ca g «cta« ta.cgcctca g »c«=ca« 
gg . g t 9 a g » aacc tg cc,a g . g t 9 . 9g tt 9 tta, 9 atat a 9 c t a g cct g a 99 =ccc, gt 
g „ 9 ctaco 9 t g „ tg .cc aaa g9 ac.=. t, 9 = t =t 9 ct t 99 ccc. 9 .c c tgg a tg cct 
gg a 9 acc tgt a 99 cc t c tg ctt=acca« tc= t9 ta 999 
gctcg t ggt . tcc tgtg c. g ac« 9999 « « 9 «« t99 «— «« 
t ca 9 c«ctt ccacca 9 c« c C9 tc t cc t c taa t9t999 . aa C99 aect 9 «.=c. t99 t 
««=c«ct ct g .a g tc t9 t g at 9C9 »t a 9ggg a g9 . 9 ctaa 99 a g9 c aac t99 actt 
gctg ce«ct c.. g9 acc=c ~ t g caaac,ac t ™« 
, co cttctat «a*caca 3 t g .a tg9 cta a.« t99 =t= t g cc,«c« g c,a gg c=a g 
aactg a, g .a c.c.ttcact t 9 a t a«aa, acta««ta aa 9 = 



1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3044 



<210> 47 

<211> 3100 

<212> DNA 

<213> Mus musculus 

<400> 47

atg^taLg ctatactgtg tacatagtat atataatata tatgaaatat atatccatat 60
gcaaaaagtc ctgcatgcct caattttctc atccctgaaa ctggaagctt cacttatcat 120
ttacaaacag gttccaacat tcctcttttt gtgtctggtg ccagaactgg tttggaagct 180
gttaacatgg ctgttttgct tgctgcacaa ttcggtttcc atctgtgctt atttacagac 240
aaaattcaat gttgggagat gcttctcaag gttcaatctc agacctttta ctttcgttgg 300
tttggttttg gtgccgccga acggtgtcag gtgagcaccg tgagtccgct cttctcccct 360

tgtgttttcc cctcgtctct gcggatactg tacagcaatg gtcaactttg ccacttgcac 420 

tgagttttga gtcaaaccta ttttcttaaa tgaagttgta acttcggtat aactcaagta 480 

tattgtatat tctttgcttt tagttaaaaa aaaaaaagta aaacatttta gctaattaaa 540 

aagcactcag gtgataatta tgtaggaaaa aaaaaaaaaa acaatcttgc caaataatga 600 

acccatccta ggatgtgtag acaataatct gcttgaatat ttttgtagct cacttcctcc 660 

ccacgtttcc ccaagtaaag ctgaagtgca gatgattcag agctgacact ggatgctcaa 720 

gtcctccaca gggacagagc ggatggctcg aaggactgca gagcaaaaga gcgggagcct 780 

gcggtggtgt gttagaacgc cacaggcact ggtgaggaga cagcagggga ggaattctct 840 

tcatttaagc atttctttct ggcctctgct tagacagcgc tcagaaatgc catgtggtag 900 

ggcctgcttc tttgaaggtc acagctaacc aacccccagc tttcctgccc aggcctggcc 960 

tcagctttca gtggcagccc cctagattaa ttgagctcac caagagtagg aaagagaatg 1020 

gcagaatgga gcctgggatc cacaaggact taggctaatg catttctttc ttccttttac 1080 

ccttccaatg ccctctgtac tcttgaggtt ttgttctgca ccccccctcc cccaagtctc 1140 

ttgtctgaaa gctgcttcat cgaggcatag gacagatacc gtaggagccc tctgccctct 1200 

gcccaagccc tccacccctc acccccacct ttctcccacc ccaggtaata atctgcttcc 1260 

cttcctaaaa actgcttggt ttgcagatct gtcgagcagc ttccttggcc ccagggtatc 1320 

ctggtgcaag ccatgtttac aggaaggcat gccccagggg tcagctccct cctcccaaat 1380 

ggtctctatc tatctgcttc tgttcagcag cctggagacc actccagctg tgcaaggtta 1440 

tccagaaaag tctgatgttg ggatagggta gagggtaatg gggactagat ggatggttga 1500 

ctttgtttct tcctctgtgc cattgtttgg acaatattaa agctgcatgt aaaggggaaa 1560 

gtaatgtatg actagtaggc aaaagtgaaa ccctagtgac acgttctagt agtactaact 1620 

tctttctgta cctgtagtag tactaaaggt ttaagtatgt atgaatgcca gaagttgttg 1680 

attcatgcag tagaatttaa ggggaaattt acttctttta aaataggtcc attttttaaa 1740 

acctgcctct gggttttgag agagagagag agagagagag agagtgtgtg tgtgtgtgtg 1800 

tgtgtgtgtg tgcagatttt gatacagcta tgttggagct gccatcgttg ttacaacgtt 1860

ggtacttcct ggtctactct aacagttccc ttcaccagag gcatgtctcc atgaaaagca 1920 
ggtaaagcta
acaggagtca 
cttctaaaga 
tctgcttgtt 
gtttacccac 
gggggccagg 
cgttttcttt 
taaaaactcc 
gtgttttgtt 
tgtcctggaa 
gtctcccaag 
taacaagtaa 
cggggcatgt 
tgattcaaaa 
gtccccagtc 
aaggggaaga 
tcttttcttt 
cctctgtcct 
tgtgctttgc 
acaacttcat 



acgtttagct 
gcagacagta 
aacagaaggc 
ttatgttttc 
tacgtttctg 
gcgtcaggtc 
tcctcttcca 
tatttctttg 
ttggttttgt 
ctcactctgt 
tgctgggatt 
ttttattcag 
ccgtgtcttt 
ggagacctca 
tgtccttccc 
gaagagagga 
caacaataca 
cctgtgtctc 
cgagttgtgt 
gacttactga 



ccttgcgaat 
tttttttttt 
ttgtattctg 
ccagcccgtg 
ttgccagtag 
tcctcgtttc 
ccccatctgt 
ttgttgtttt 
ctttgttttt 
agactaggct 
aaaggcgtgc 
tatccaccag 
tactgttcca 
ctggggacca 
tggaatccct 
gagtgtgttt 
ggacactcct 
tgaacagcct 
ttgtgtctct 
ttctggaact 



catggggtac 
taactctctt 
ctcaggccat 
ctgcagctgg 
ctcagaccca 
tctctgccct 
gccatggaaa 
tttttttttt 
tgagacaggg 
ggcctcgaac 
accaccaccg 
gaacagcaac 
aaccattgct 
gaatctaagt 
cagtcaacca 
taaaggaaca 
atccaaaccc 
ccttcaggtt 
gcgttatctg 
aagcagttcc 



tcacagttgc 
attgcctttt 
catcagtgcg 
aggtcttgcc 
tggcagccac 
aactcagccc 
ctagcatttt 
tagtttggtt 
tttctctgta 
tcagaaatcc 
cccggcttct 
tggctttgtg 
tatcaaaatt 
tctttaagtg 
cattccctct 
tttagttgtg 
aggcctcccc 
gcccaggctg 

gggggtgcct 



ctgttatggt 
taagtgacct 
gagaagcctc 
acattccaca 
ttccagggct 
tcatcttcct 
caaaggactc 
ggtttttggg 
tagccctggc 
gcctgcctct 
ttggtatttt 
tagttgctca 
gtttctgagt 
gaattcagac 
gtagaaaaga 
tgtttgaagg 
ctcggagcag 
tgggcaggtg 
tgaataaagt 



1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3100 



<210> 48 

<211> 2023

<212> DNA 

<213> Mus musculus 

glggaggagg agccgaccgg agagagatgg agcttctggc cagacttcgg aacaagaagg 
acacaacaac cttgtaaggt cttgtagagg cgataactgg cagcctctgc cactggcgga 
tagctgatat aagctagcag ataaggttag ggcaggagat tctttgccca ccgattgtgt 



60 
120 
180 
tctctggcca gttttaagat


aataaaacag 


tctatgtttt 


tcttccttgg 


agaaaagctg 


240 


gacagagaaa 


gagggaagcc 


acctgacagc 


ttggtgaagc 


taagcttcga 


ggggctagcg 


300 


ggaatgatca cgatgccgag 


gaaggcacta 


agccaggagc 


gctggcagag 


ccctcgggaa 


360 


ggcaatagcg 


tgttttataa 


aattacacgc 


aacacaactt 


atatgcgaaa 


ataaaacaca 


420 


acgagctcat 


aagcaggtgc 


agggaggact 


gcagggtggc 


atgcctttct 


ccaaggtttt 


480 


tttttttgtt 


ttgttttgtt 


ttttttaaac 


tcatttagac 


acaacagttc 


aggctgacct 


540 


ctgttttgat 


ggatcctgag 


tgaaggctac 


agttggggaa 


caacaacaac 


aactactact 


600 


actacttact 


acttctacta 


cttctactac 


tactactact 


actactacta 


ctactactac 


660 


tagcagcagc 


aactactact 


acttactact 


tctactacta 


ctactagcag 


cagcaactac 


720 


tactacttac 


tacttctact 


actactacta 


gcagcaacta 


ctactactta 


ctacttacta 


780 


ctactactac 


tactactagc 


agcagcagca 


gcagcagcag 


gctggctaga 


gagatggctc 


840 


agcggttaag 


agcacttagc 


tgctcttcca 


gaggttccga 


gttcaaccta 


cttggtggct 


900 


cacaaccatc 


tgtaatggca 


tctgatgccc 


tcttctggtg 


tgtctgaaga 


gagcaacaat 


960 


gtactcatat 


acattaaata 


aataaatctt 


ttaataataa 


taataagcct 


cagaatatat 


1020 


gaccaacttg 


atcttgctct 


gagccaaact 


atttttctat 


gtctatagtc 


aagctgtttt 


1080 


tatgtcagca 


atatgggatg 


gctggcaggc 


atgaaacaaa 


ggggttacag 


ctaagctaac 


1140 


ggtttcatca 


atgatcatgt 


gccctagagc 


ctgggattta 


cctttgagaa 


ttcgtagata 


1200 


gaaacgatgg 


gtttggactg 


ggtcttctcc 


taccatcctg 


ttagtatggt 


ttaaatgtcg 


1260 


ggtttcttta 


acgttggtgg 


aacaggtcag 


aagcggcatg 


catagaaaca 


aagctccaag 


1320 


ctgctgtctg 


aagcatcatt 


cagagttcct 


ttacccttta 


aggaccttca 


ctcagataag 


1380 


actggtcccg 


tgctttaact 


caaaagaatc 


tcggagtgag 


ccaatagtga 


accagagccg 


1440 


gctcttgtgc 


ttgtttccct 


ttgtttttta 


agatttcatt 


ttggtattgg 


agaaatacct 


1500 


gaacagttaa gagcacaaca 


gaaaacccag 


gtcagttctc 


agcaccctca 


tccagtggct 


1560 


cacaagtgtg 


tgtaactcca 


gctcctgacc 


tgggagaatt 


gagaacgcct 


gcctccctgt 


1620 


gtggtgcaca agctcccaca 


tgggtggatg 


gtcctgcacc 


atctttccac 


atgtttacaa 


1680 


tttttttatt 


tttttatttt 


tggtttttcc 


agacagggtt 


tctctgcgta 


gccctgattg 


1740 


tcctggaact 


cactctgtaa 


accaggctgg 


cctcgaactc 


agaaatccgc 


ctgcctctac 


1800 
ctcccaagtg ctgggattaa aggcatgcgc caccactgcc cggcatggtt acaatttttt 1860

attttgtcgt tatagttcct tttccatttg gtaaaatcta gtcattttac cttgggacaa 1920 

aacctccccc tccccccacc ctgtttttaa aattttattc attttgtttc ttgttttaga 1980 
gtggggcttc atgtagtcca ggtttacctt atgctatgta gcc 



2023 



<210> 49 

<211> 3693 

<212> DNA 

<213> Mus musculus 



.£S£cUg. cttagggtcc agtgaggaaa gcatccaaca tctatgacaa atagtgaata 60 
agaaactgta cattaactat ggagaaagag ctgtgaggga aaggaggcct ttgacatttt 120 

180 



agactatctc ttagcccttc tcctgcctct taattaagag gtgctttctc tttgcactgg 
accctggaaa tgatacagtc atccccgctt ggttccttct ggccggcaac ttctgtctgg 
tcttcccctc ctgcttctag aaaggaaaca agaagtcatt taaagatatt gctgagatca 
gggacgagtc atttcttgtc tcttttctgt ccttctgaag gtcactatga ccagctccac 
ctcagacctt agaagttacc cggtctgagg agaggagaca aagtcgtcta tcactcgctg
gcttggaagt catgaatctt cctgccagcg agctatttct ttgcagacac ttactatgtc
cccagttctg ggttagtgtt ggctctgggc aggaccaatt agggatgcag gacaggaacc 
ataaagcaag cttccaggaa atgaggctcc aagttcacag tcagctttat ttaagggctt 
gtttgcaagc acagggacag ggccgtgcca ggcaagt.aa acattgtccc tggccccaag
gggatggatc atggcgagtg cagacatggc ctggtgcaga gagggtgaga agacagcctg
accactgata aggtaagcct atagacagtt gtcacaagca tgacactgct cactgttgct 
ctctaacctt ggtccagaat cacgggcatt ccatagttgg caactgggct acgtctggtt 
catcgcaagc tgtcatggaa ggagaatgta tgtcaggagc cactgtgccc tgccccccac
cccccccgcc ccactctcag gtctttccca gggcctacct catagtttcc aaacctttca 
agactgcggt aatgccctct ctaccggtac attttaagcc ccgctacttt gtttgttact 
ataaaagtcg catcagacat tcccagaccc caagaggcca gacaaacgcc aactgtcctc 
ctgtggtcct gctagacaca tctagaataa cccagtcagt caatggacag gggagacatc
tctgtgtagc ctggcaggtg ggggaagtat cccgtgtggg acatacctgc ctgtcctcag



240 

300 

360 

420 

480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
cctatgaaaa ggattcacta gagagaattt agttgtgctt ccctataggc tcaaatgtga 540

tccccaatta tggattgcta cttctcaata accttcagga gcttcacact atagttagtt 600 

ttagggcttt cagatttaag tcacaagagg tacacacgtt cttttcagtg ttatccaggt 660 

gataatgcaa aggaagaggg gcaagcaatt aagtaggaat ttagagagtc agagggagag 720 

tttcaccacc agtgcctctc ttcagaatcc tacattccta gtcatagagc atccttttgt 780 

ttttcttaat ttccatacaa atatcagtaa gcctatttac cttgctctgc ccttggctta 840 

caggctttag gaatctcagg tcaactggag gttatttctt gaatgaacct tcagatttca 900 

ctctggggac caactccctt cctgttgaca cagctaagta cactgaacaa caaaattgaa 960 

atgtgctcta gactccatgg aaccagcgag gttaaatctt tgccaaggtt ctagtgcctc 1020 

ttggtgtact tctggagatt ggagccccaa gcaaagtttt cttggaggac agaatcaaag 1080 

aagataaagg aactaaagtc aaaacagcct ggaaatgaga atgagagatt tcccagcagg 1140 

cactgtctgt aagagccttg atatcattgc attaaagctt cttttccctg tagacaatct 1200 

ctccaccatc atttattgag ggtcaatgct tgtcttctat gtcttcagga gcttagcaga 1260 

gtctctgctc caatgtatat tcttggtctc tctgtctctc tgtctctgtc tctctgtctc 1320 

tctgtctctc tgtctctctg tctctctctc tctctctctc tctctctctc tctctctctc 1380 

tgtgtattaa tcaattgtta ggttgttcca actgtacttt aattccaatg gttgctaaat 1440 

aataatgttt cttttatttt tcatatgtga ctaacatacc agtttttctt tctttttaag 1500 

ctaaaagtgc catggaattc caccaacaaa aggggcttaa gaatttgaaa acaccatttt 1560 

tcagatatag cttaacagca tattaatatc atatgtataa gctgcttaac ctctctgtgc 1620 

ctcactttca tataaattag aattataaat actcactctt aggatgaaga aaaaagggaa 1680 

tggggatgcc ccacttcttc tttttcactc cccctaacac ccaatatgca tgaatgtgac 1740 

ctctcagagt tacttatgaa ctctgctatg tatggagcta aggccgtggc fcatgacctct 1800 

cggaacagca ggtaaagtca gggctcatgt gctcccaatg agtccacatg gtatgcagct 1860 

gctttgtctg tctctggcag tcagtatcca tcacagggtt acagttactc agcgcttcag 1920 

acaggtatct tgtggcttac atgaagaagt ggctcgttcg ctaagaatgt attaattgtt 1980 

ctctctgtgg tcagaactgg gagcgtggca gttcattgaa aggggtacaa tctttgttca 2040 

agtagaacat gagaagagag agaaagagag agagggagag agagagaggg agagagaggg 2100 
gtgagagaaa gagagagtaa gagattaatt acttcttgac agaatcaaga aggataggaa
gagcttgcct tagaatgtga atgtaattag gaagataaag gatgaaagaa taactgggaa 
gggggcaaca ttactagaat atcttttaca gacaggaggg tgaagcatca tggtatactc 
tgatgagcct gtggcatgaa cgaagctgag atagcagggg cagagaatgc agtgactaag 
aaaggacttg tccatccttg gagcttaaat gttattttcg gtgcagcata acactgagaa 
ttgatggcat cacctgaaga cggatctgga ctcttagaaa tagctcccac ccactttctt 
gagaccatcc tacacccatg acaagatcta tgtcatggtg aattacagct ttttctttgt 
gtcaaccaaa ctaggttatc catgaatttt cagtgtcctt gtcccaattt cctgaatcac 
aaaagagcac aggaagaatg cccctcccca tgccaaccga atccccctgt ctatggatca 
gcacagcatg tcttaaagcc tgagtgagct taaagagtaa ttgctattgt ttaactttac 
caaggotaat ctatcacacc tcacccagag aacccctgat ccactctttg agcattctct 
gtcctgaggt aaaacataac aagcaaataa ataagaaggg aactgtgtgg aaaccctctt 
tgtcactact gaagcatgtt catttattta gcaaaatgtc cataattttt aaattgcttg 
aatcagccac tggctatttg gtatcttcaa ggggtc.cca tcaggaataa tacttcccca 
ttacctgcaa aaaaaaatat tgaggcaggc ttttgatcac aggaattaat tacatgcaca 
gaatttcatt gctgggagca gcaagcagct ggttcctgca gggccctggt tgaactctct
tgccaactcc ccttctatgc ttgatcctcc ctgcacacct acacccttgc tttctttcat 
tatgctccac aggttctatt caatggggga aaattgtaat taaaacattt acaaagcttt 
ccttatgacc gcccttaagg ctgcgaacct tcacaattca atcttttttt ttttttccaa 
ataaggcaca atgacagagt ttccaggaat ttcttcctcg gggactcagg cctcctagaa 
tgatattaat acattaaaaa aaaaaaaaaa acttcacaat gaagctctgg gataaaagga 
gagcacgtat cttcttcaag ggaggggaga atattgtaat gatgactaat tattctcagg 
agccaacagc ttccctggtt gtcagtggga tcagttaaca atggcttagc ttgtctatct 
tccttatttt cctgttaatt attcctacct ctgctaccaa gagaaggggc ttgttttcct 
cttagatgta attagagtaa tgaaagggtt tataatattt atatatttta tttctaggac
tctattcaat tttactttca tgcaggacag aatatagaat gcaaaacaga aacctcaagt 
tcctaggttt tgtgaagtct tacagagaaa attggtttca ccataatagc aattaggagg 



2160
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
gataattcct agatgaacaa cctgaaatta ctcccttaaa ggcaatactt tatataggat 3780

tttgagaaag gtggggaaga tgaggtgaca attttggtgc attttatttg tgttgatctt 3840 

tgtattttgt gataaagaag gaaggcaaac cattggtgta ttcttctata gacctgcatt 3900 

tgcatatggt ttgtctctgt tgaatagatt ttggtttgga tgacaaatta aatccccagc 3960 

ttccaaacac ccaagttctt ttgttcagaa tttataagcc aggatgccca aatacagctc 4020 

ttcttcaaag gtaaaggggt taagcaaaca gttgctagac aattattctc ctttttatac 4080 

taacaaaacc accttctagc agctcagaac acatagcaaa tagcatttaa aaggtattat 4140 

gccccatcat cacaggcatt tccatggcaa tgaagtgatg tggcacacac aaacaaggat 4200 

gtaccagttt ttttttcaac atgctg 4226 

<210> 51 

<211> 1560 

<212> DNA 

<213> Mus musculus 

<400> 51 

gcctacactt ctgcactatc tgtgatcagg acgacagtcc aaattcaact atatattaac 60 

ttcaattact tgaggtgttt aaaagataaa agtgtactca aggttttcag catctgaaaa 120 

tatatagaga aaaaaattat agcaaactaa tacttcatgt ggaagtatta aatttaaaat 180 

ttaaattatg tacccacaca ccacagttat atctttaaca gtactaacac cagatatgca 240 

gcctaaagta cttctcacag acttgaactc catctacaac taacattaag aaataaaaac 300 

aaaaaccatc ttcataaacc actgatcata aatttctatt ttttgttctc taacttgata 360 

ctatatttaa ttaactgact ccttttgttt aggtatgctt acacctaaaa gatggagatt 420 

gttttgatac taagataaaa ctttgagaat ccttaccaaa ttttaccatt aaaaccctta 480 

gtataaaaga ttcctatgat caaagtctaa tagttctttt taagttgtat ttttaaaata 540

ttaatgatta gatgctccac ctgctgaaga agaatatgac ctggaattcc aaaattgaca 600 

catggttcat aaacgaaagt tacttagaga aaatagcctg aaaaatgaaa aagaacagcg 660 

aatccttgtt ctaaaggaaa gagaaaacct gcatgtgtaa ataagttcca gtggaaggtt 720 

tagaatctgt cctgtgcccc atgctgttta ttaattactg cagtttaaaa caacaacaat 780 

aacaacaagg aggatgtgtg aactgcattg ctccttcatt caggtcagct tggctttctt 840 

cttgccaggg cccttatgcc ctttggcctt tcttctcaga ttgcgcctgt ctactacagg 900 
aacttccact
ctgttccagg 
attctgtgga 
tggttcctgg 
gggtgataaa 

attaccatca 
tgttttcttt 
gctttgatcg 
aaacaggtca 
gtgaagacct 
caatttcttc 



tcaatctcct 
tgctcagagg 
aatggcgcgt 
gcctcaggtg 
ttctgagaga 
gacttttcta 
tttgaaactg 
atcaacattt 
gtgaggcatt 
ctggactcct 
aaaattgtct 



ctgagctgct 
acgcttcaga 
cttcaagttc 
ctccctcctc 
ctggttcttc 
tatttgatac 
gggacagatt 
ctgagtcagt 
ttacaatgca 
catttttgtt 
catgttgact 



atccacagct 
ctgttcctct 
tacctctatc 
accctggtct 
gggaaattct 
agtggcaaca 
ttcttcagta 
atcaggtgag 
gttctaaagc 
cagcatgaag 
gtaatacaat 



ttctctgctg 
gactggctct 
agagaactct 
tccactgagg 
ggtaccacag 
ttgtctgagc 
tgtcctttat 
gtctgggaaa 
gattcattct 
acattatttt 
aaaagaaatc 



actgtgtaaa 
tctgtcctgc 
gagtagccga 
aagtagtgtt 
aattgggaaa 
tccccaaaag 
catcagatgg 
agcacacaag 
caatgtcact 
aaatgcaact 
tttatgcccg 



960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



<210> 52 

<211> 2849 

<212> DNA 

<213> Mus musculus

<400> 52 

gccagcctca cgggcctgct 
ctgagccaaa gaaatactta 
aataaccctg gcttgaactg 
gtcttatata aaatctcgtg 
ctctctctct ctctctctct 
agttcctaag ggaatataga 
ctgtacttcg tgtcttctgt 
catctccctt gcctgccgtt 
gctcatctcc ttcatcactt 
cccttcatgg tattttaagg 
cgttaaattc aatggaagat 
ccttgagtta ttttacattg 



gccactgtct 
agaatttggt 
tttcttttac 
tgttctctct 
ctctctctct 
tctttccctc 
gtgccgtgtt 
ctgtttggag 
ccttttcttg 
gaaaacatgt 
tgaacatcat 
ctgtaaatta 



gccctgtcct 
catagcaatg 
ttcgcaccaa 
ctctctctct 
ggttttgctt 
tccagatctg 
ctgcacaaag 
ggagaacttt 
attttcatat 
aaaagactca 
atgatctggt 
cattctacta 



attgcactgc 
agaaaagtag 
tgcatttatt 
ctctctctct 
catccattca 
tcatagcact 
gcccagccaa 
ccagtgcacc 
gttagagaac 
caaatagatg 
tctatgcagc 
gtaacaatgc 



atctccgaac 
taactaattc 
cttatataaa 
ctctctctct 
tttgtcctag 
ttcttcacat 
taagaacacc 
cagttactgt 
aacaaattag 
aaaatactct 
cttttcttgt 
aaatgactca 



60 
120
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



tccatacaaa tgcaaattat acctaatttg agattcgact gattattgca tgtgcctttt 
catgtgagca tgtgcatgtt gaggccatgg gacagtctct agtcttacag ctcagtagct 
gtctgtctta tttcttgaga cagtgtctct ttctggcttg gaactcagca agtagggcta 
aactggtagc caaataatcc cagagatcag cctaactccg cctctccagg gcttggacta 
aaatgtgttt cacagtggtt ggtatttaaa gaatgatttc taatgtgggt tctgggaatt 
taccctatag tcacacattt actattggcc cttggattac cacagtttct ttctttctac 
ccaatgctcc cccttttgcc tcttgcttct actaccagta tggggaacac aaactatctg 
tgtcgcctcc aaactgctca tgtgtctggt ggcataaatt tgtttggaca atgtgggaat 
aaagtagaca acaacaacaa caaaaaaaag gaataaaagt ggacaggttg aagacaaagt 
ttacaaggaa acctgacaaa gttggcctat gggttgcatt tggatcaaga atcatgagaa 
tcatgataac ttctgtgtct ggcgtgaaag actgggtctc cagggaacat ggatgtagga 
agaagtgttg tgtgatgagc cgggggaaaa gtccttaaaa tattgttttt aaaaataatt 1440 
gattacatac aggcatatag aaagttaaaa gcgaacatcc ccaggacatg gtaaagattc 1500 
acgtataaaa ttctgcctca agttaccacc cttttcccac ccaggacaga agttcacaaa 
ggacatccac tctgttccca gcttgcagcc attctttgag gcacaattgc agtttttccc 
actagtgagt ctgtgttctt ttgtttatga actggaaaga aagacatgcc ctgaatctcc 1680 
ctcatagact aagaaaaatg ctttctatgg agtatttggt ttccaatggc tggaaattta 1740
caggtcactg cttcttaaaa ggcacactgt cccttggaga aggtgaatat gactatcccc 1800 
ccattgtctc tatggtagaa aacatgtaac cattgattgg gtcagagtgt ggcaactgaa 1860
atctggaaaa ggaaatttta aatgactatc ctgacttcac agcctcttgg ccaaatggca 1920
gtagcattat cattgagcag gttaaggatg ggcaatggtg gttaattgag ttatagtcct 1980 
tgtagatgaa atgtccaaag tgcgcgaggc ctcaggaaag tctaatgctg ttagatgcat 2040 
cctgtataat agaatacctg caaatgtctg taaacaaatt aggattgott tttgagtggg 2100
aatgtatagc tctgcaggga catgcgaaat gtatctggta atgtctgtaa agcacttgaa 2160 
atgttttttt ttcttttagc cacttttcct cagggatact ttgtctcccg tacttctaat 2220 
tccaatgata aagaataatt gtcaattcta cacttagtgg gttgtgcaga gtgactcacg 2280 
aagctattta caacacatcc agaactcacg attacatttc tcatcatgaa gatctgcctt 2340



1560 
1620 
tctaacttgt

ggtccttagg 

ttggaaagaa 

caaatcactt 

tgccactcaa 

atgcccaatg 

attattctta 

ttttttaaaa 

actgaacctt 



gacatttaga 

gaaaaacttt 

catttggtac 

tacctcctgg 

ggctgtttgt 

aaacctcctt 

aatcttccct 

tagccaagtc 

aaaaaaacaa 



taccactttc 

actaccggag 

ctgacacctg 

actccagctc 

gtgtctcggt 

tttccttcta 

ctgtctatgc 

caacatacat 

aacaaaatg 



agcttgcagg 
tagaagaatg 
gctctcacat 
tcttacctgt 
gagtaatgta 
aattcattgc 
aggtcaattc 
acatgcactt 



agtggcagct 

tagtaaaaag 

ttatgagcta 

aaaatgaaga 

tagttcttga 

aattagtcaa 

agaggttaac 

agattgttaa 



ataacttcgt 

gagcgtaggt 

cacaatctgg 

caatggcacc 

cagtttaatg 

ctcaagagca 

tgatataatt 

atagttcttg 



2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2849 



<210> 53 
<211> 3551 
<212> DNA 

<213> Mus musculus 
<400> 53 

tattcactga agaaacaatt 



gtgctgcagt ggctagatag 
cttcactggc actgccaaat 
gcatttatca tatcatcccc 
aagaaaaaga actgtaatcg 
ctatgaatat agctttacaa 
ggcaatgaat taagtgtttt 
gcgtctgtga acgcggctca 
tggcaactgg cataacaatt 
ttctaaattg tacatctgac 
tcaggcagtg gcagccagga 
gggctgcaat tgtgggagca 
gctgcctcag agcaagacca 
atttaactca acaacagtcc 
tggccatcgg gatcattgaa 



gactttcgct 
agcgagcaga 
gtacttccta 
tactcctcaa 
cacatttaca 
aacagatgct 
tgtggccctc 
taattgtttt 
agcatcctcc 
ttgttaatta 
gctgcttgaa 
gggctgcfcgt 
ggacttgctg 
cattcccacc 
atgaggtagc 



tctctgttgt 
atggcttcct 
tttgttgtgc 
tgcgagcaaa 
tatgcttcta 
gtttaagaaa 
tcatccgtag 
tcacacataa 
agcaatattt 
ggcatgacag 
atgcaaagag 
caagtgccgc 
caaggatcct 
tatctgaact 
aacacaaaag 



caaagtggcc 
cctggctggt 
aagggaattg 
aaaggagaag 
attgttgatt 
agggggaaca 
ctaggagcag 
gttatgcaaa 
tagcaggtta 
aggtggtaaa 
caaggattga 
ctagcagctc 
gccacttaca 
gtttatgttg 
aagttctttt 



cctggtgata 
gggctggcag 
gaacagcgag 
ttgtcaatga 
tggggatttt 
taattttgtg 
tttgtggacc 
tgagctttta 
attgcaaaat 
atagttatct 
ttggatttga 
tgctccagcc 
agcctgcttt 
tacagtttgc 
gggcttgagt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
tttaggaccc


tQcraacrt*"ta ^t*t-t-r«f-^*-i-*- 


caaag tgt 


gtcagcactt 


gccacttaaa 


960 


aaaaaaacaa 


tCft fct~ aOdhs nn1~ aa^napa 


acccccaatt 


tttattgaat 


gtaaatttta 


1020 


g 1 1 a t c c acrcf 


catcaaahah rra j-rta*.*.*.*.- 


cucgugataa 


tttaactttt 


caacatcatt 


1080 


tcctttcatg 


ai «i'3a^»*'yV( i- uyyyy dd t-d 


CCaLtLtyCt 


atttatttat 


ttttaacaaa 


1140 


Qtaacraaaoa 


aaaaaaataa ^hnt-panaar' 


tacgtaaagt 


acattggttt 


actttgggaa 


1200 


atctataaaa 


L -y c *'-c»«i-w^i- yaauaLdyLa 


uacLautCtt 


agattttata 


ttagtaaaga 


1260 


attcttcfcgt 


l " t * t, 33 *— clay UactL- L L L 


taaaugacct 


aaaaaatttc 


tatggacggt 


1320 


ttcatagtct 


y aaL, yy La, 'Ci a. u u uctcty 


cdcaaacaCu 


agtgaataga 


gcatgatgac 


1380 


accrtcrtatat 


CfcCTPt" t*CTt*crf~ rarnahnaan 


*rt 4™ ^r^»* 4^ 4^ 4-» 4-* 

catgctcctt 


gtatttaaat 


tagctaatta 


1440 


catttttctg 


\~_a C rl t~ z^l i~ Cjf~ aparapaant* 
u y ua Lay i, ct\— di^dt^ddy U 


dcaccacaca 


tgaagtgaag 


catgtatgaa 


1500 


Qfcctaatcrtc 


atct*t" t*Cia At" i"papaf-at-af 


LLtctcgact 


aatccagatt 


atgtttaatg 


1560 




a t* A "f" t~ 1" +* P» a •f" *- +- o a3a ~ _ j_ _ 
aLdLLULtdL CLaaaaayta 


aatgttctgc 


atgtgcagct 


gtgctaatat 


1620 


l» w wCivCl V» I* l» 


p}-^flt*p;apat* r*r4-~ <-t4~ ****** 

uttgugacat. gt-gi-anaugg 


agaaagtgcc 


atttatgata 


tgttgtcatc 


1680 


CLd V— Q. d L L L y 


tcatcgagaa caaaaggcct 


aaattcatcc 


acccagtgga 


acattttgtc 


1740 


atatfctatfct 


^aananrat'na a a ♦"■H artfria a 

uddy dy dcy d aoLtagugaa 


cagttgggct 


ctttatatgt 


aggagatgtg 


1800 


aaaataaracr 


aatai.tt.occ augaatggag 


tgcaacatta 


gcatctcagt 


gccactaaat 


1860 


fcafcacagtag 


taatcrcrt"acrt* or - hat"t*r»*ra'l-a 


ydCdLCLddU 


aaatagtttc 


ttgttttccc 


1920 


actttcttta 


fcafccrt~cit"t~ nt - t-t-at-at-rrrrrr» 
*-a. ^y *-^» ULctLaUyy^L 


tagt.Ltcai.a 


ataaaggtga 


tattaattat 


1980 


tggttaactt 


fcttt" a.ciaCTt~cr afrhat"papa 


+- 1~ ora a a p» ♦* o 
LLLydddCtd 


ccgtzacctcg 


aatgagaata 


2040 


aaacgttgtt 


ccaaatcatt acatttacta 


aM"aaah rrr*\~ 


CdCCydLtUC 


cacccactigg 


2100 


fcofcafcttcrcrc 


aaft't'aarti'a anarfl"ht*arp 
aatLLaayLa aycty LL LaCL 


acagtaacgc 


ttcagtggat 


aatatttgga 


2160 


atagagfcaac 


Caft*hhaaf*a a.rrt - l-r«af-r'na 


gggagacuca 


gcagggagca 


tttcaggtgt 


2220 


attacggctt 


ttgtcttgtc aggatgcaca 


atctccatac 


cattagagaa 


aaggcttcag 


2280 


agtcccactc 


atctccgtta ataatgatac 


taacaacaac 


aacaacagca 


acaacagcag 


2340 


cagcagcagc 


agcagcagca gcagcagcag 


cagcagcaaa 


gacaaaataa 


taatctagag 


2400 


cttctccttt 


ccagtaaagt ctgggcagca 


agatagaaag 


cacaggcagg 


tcgagtgttt 


2460 


ttaggaaact 


tgttaagcga ataccatttc 


tgtgggttaa 


atttccatca 


catttttaac 


2520 
tgtcctaata
aagagggtca 
ttatagatcc 
tggatacttg 
ataggatgtg 
cagtaaatgt 
cgaatagtac 
ttaaatttca 
ctttagagcc 
gtgtgttatc 
catttgagtt 
gaactgactg 
attccagaaa 
aattgaaagc 
gctgctttaa 
ttcataacaa 
aacaggcaac 
gatctaaagc 



ttgatcaccc 
gctgctgaga 
ccatttcatt 
aagttcattt 
cgattatggt 
acatttaaaa 
attatgtacg 
aaaattgtcc 
taagaaggat 
tgtctatatt 
atttttttcc 
tttaatgttt 
ttaagtcccc 
agaaatggga 
taaggcaaat 
accttgcatt 
cataaaaact 

g 



tacagaggaa 
cggtccagag 
agttgcagag 
tcttcctcac 
atttttgcag 
cataatgtag 
ttgctcttga 
aaagctgagt 
tgtgagaagt 
tgtgatatag 
ctttaagaga 
tctgggcggg 
ctgccattat 
aaaagactgc 
aattcttatt 
attctgcagt 
taaaagcaga 



tgagactgaa 
cagaggctct 
tttcaaggaa 
attaaggcag 
gggcagttta 
ggactcagaa 
tatccttgtc 
ataatcatgg 
gccagtcccc 
gtaattgtgc 
atatttactt 
tatttatggt 
tcggcaagcc 
aatgcaatga 
ggccgctgtg 
tgcatcgaca 
tgtaaatgtc 



gcctggtagt 
catggtaatg 
gagattttct 
gaacgtgaac 
ttccctacat 
atgccagctg 
attttttttt 
tcttctcttt 
ccaggtccag 
tttctttctg 
agctagtatt 
attttctttg 
tttcatacat 
aaatttaatc 
ttaaggtttc 
gctccacttt 
taaaacaagg 



tattagtata 
atgctgctat 
ctaggggaaa 
aaccttcagt 
gtatttgcca 
ctgttttggc 
cttgtaaaaa 
cttccgagtg 
tctgtctaca 
gaattcttga 
cacttaatta 
ctatatttgc 
tagaatgatg 
agcgtcttct 
taatatttaa 
gctgcctgcc 
agaatgatta 



2580 

2640 

2700

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3551 



<210> 54 

<211> 2244 

<212> DNA 

<213> Mus musculus 

<400> 54 

gcaatggagg tggtgttcag 
aagaagtgat ctcagaccat 
tcagagcgag gctaagagct 
actagatatc tgcagcctgt 
ttgctgttag aagctctgct 



acaaagagag tgggtttcat gttcaaggaa gacattctat 
gggctagaga agtggacaga gaattgaacc aggtagtcct 
ttagagtcat ccctagcaca gggaatctgc tgagagagtc 
tctaaggtca agattcttgt caacttcttc tctgaggtgt 
tctagaagct cggttcctag agcagagatg gtcataggtg 
120
180 
240 
gcgaactcca


aagaggtgat ggctagaagg gacaaaggtt 


ggtctcatga gactaaagtc 


360 


ctagttttga tgtgatctac ctcttttatg 


tatgtgcttt 


ccactgtaat tccatgcact 


420 


agagctaaac 


tgatgatgaa accatgcttt tgaacatcta 


gaaatgtgat ccaaagaaac 


480 


tagagggaca gactgtgttc caatggtcag 


atgcaactct 


ctcactaact gataggaggc 


540 


actgttcttg ttatgttggg ttagtgttta 


ctgtaagtaa 


tgttcctcta gcaaacgcta 


600 


aactactttt 


aattttaata agtcaaagat 


caggcaattt 


aatatgtata tagaatttgt 


660 


aaattaattg 


tataaaataa tttacatata 


cattttataa 


cattatagca catacattta 


720 


tataaacaat 


atagggtaag catttaagct tatttgttga ggtatccaag cctcaacaaa 


780 


gtgtggggta 


agaaacacca atagagatgg 


ctcagcactt 


tgtcctgtgc ctctctctgt 


840 


ggctcactct 


tgtctcctgt cacacactct 


cttcctgaag atggcagccc aagttcacag 


900 


ggcgcatgga 


gccctgtgct tgctatagag 


ccaagaatga 


ccatgaagtc ctgaccctcc 


960 


tgccttcaac 


ttccaggatt ataggtacac 


accacaacgt ctggcttctg aagttctgga 


1020 


aatcaaatcc 


agggctctgt gcatgctagg caagcactct 


ggcaaataag ctttgtctct 


1080 


ctctgaagag agactctctt ttttctttca 


gtacatttgt 


aattgaaaat agacaccact 


1140 


ctcctaactt 


ccttcttcca accccttcta 


tgcaacccca 


ttctccagcc agttggtagc 


1200 


ctcttttcat 


tactattatt ttctatctat 


ctatctatct 


atctatctat ctatctatct 


1260 


atctatctaa 


tctacatatc atctatctat 


tatctatcta 


atatatctat ctatttaatc 


1320 


tataatctat 


catctatcat tctaattatc 


atctagctat 


ctaatctatc tatcatttat 


1380 


ctttgtttct 


atctatcatt tatctttgtt 


tctatctgtc 


tatcaatcat ttatctatgt 


1440 


atgtatgtat 


ctatccatcc atctatctaa 


tctattatct 


atctatctat ctatctatct 


1500 


atctatctat 


ctatctatct atctatctat 


ctatctacct 


acctatctat ctatctatct 


1560 


atctatctat 


ctatctatct atctaatcta 


ttatctatgt 


atcttcatgt atacaaagat 


1620 


atataaatac 


agtgtgatga gtgagctcat


tttcttgttt 


gtatgtatat cctttcaggg 


1680 


atgaccactt 


tgcagtggac aatgataagg tagcttatct 


ttgagaggtc aactctactc 


1740 


ccagcagtca tttgttgtct acagttcttt 


gactaggggc 


agggctgtcc aaaatttgaa 


1800 


tggtagatga 


tatttacata gatgtggtaa 


cttcatccaa 


ttgtatacaa atgttgtatt 


1860 


ctgattaaat ggaagaatta tcaaatagat 


gaaaaccccc 


atcttttaaa taaagtcctt 


1920 